What’s in This Dataset

This dataset contains 3,648 realistic synthetic bank transactions spanning one year, with data for five distinct accounts. Each transaction includes essential fields such as date, account_id, amount, merchant_name, category, running_balance, and transaction_type. The transactions are structured to mimic real-world spending behavior, including regular purchases, recurring bills, and occasional large expenses.

Merchant names are realistic, such as “Starbucks,” “Amazon,” and “Walmart,” and each transaction carries a category label such as “Food & Dining,” “Shopping,” “Utilities,” or “Transportation.” The running_balance column records the account balance after each transaction, which makes the dataset suitable for analyzing financial trends or building models that depend on cumulative data.
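Because running_balance is cumulative, one natural sanity check is that each balance equals the previous balance for the same account plus the current amount. The sketch below shows that check on a few hypothetical rows. It assumes that amount is signed (negative for money leaving the account) and that rows can be ordered by date within an account; neither is confirmed by the description above, and the sample values and the helper name balance_mismatches are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows using the column names listed above.
sample = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-04"]
    ),
    "account_id": ["A1", "A1", "A1", "A2", "A2"],
    "amount": [-4.50, -120.00, 2000.00, -35.25, -60.00],
    "running_balance": [995.50, 875.50, 2875.50, 464.75, 404.75],
})


def balance_mismatches(df, tol=0.01):
    """Return rows whose running_balance differs from previous balance + amount.

    Assumes signed amounts (an assumption, not a documented property).
    """
    df = df.sort_values(["account_id", "date"], kind="stable")
    # Previous balance within the same account; NaN for each account's first row.
    prev = df.groupby("account_id")["running_balance"].shift(1)
    expected = prev + df["amount"]
    # Comparisons against NaN are False, so first rows are never flagged.
    bad = (df["running_balance"] - expected).abs() > tol
    return df[bad]


mismatches = balance_mismatches(sample)
print(f"Rows with inconsistent balances: {len(mismatches)}")
```

If the real file stores amounts as unsigned values and uses transaction_type to indicate direction, the amount would need to be signed first before this check makes sense.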

The data is formatted as clean CSV, with consistent date formatting and no missing values. Each record mimics a real-world transaction and can be used to simulate user activity, test financial applications, or train machine learning models without privacy concerns.
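Claims like "no missing values" and "consistent date formatting" are easy to verify yourself after download. Here is a minimal sketch of such a check, run against a two-row inline excerpt so it works without the file. The rows, the ISO date format, and the "debit" value for transaction_type are assumptions for illustration; only the column names come from the description above.

```python
import io

import pandas as pd

# Hypothetical excerpt; the real file's values and date format may differ.
csv_text = """date,account_id,amount,merchant_name,category,running_balance,transaction_type
2024-01-01,A1,-4.50,Starbucks,Food & Dining,995.50,debit
2024-01-02,A1,-120.00,Amazon,Shopping,875.50,debit
"""

df = pd.read_csv(io.StringIO(csv_text))

# Missing values per column; all zeros means nothing is missing.
missing = df.isna().sum()

# errors="coerce" turns unparseable dates into NaT so they can be counted.
parsed_dates = pd.to_datetime(df["date"], errors="coerce")
bad_dates = int(parsed_dates.isna().sum())

print(missing.to_dict())
print("Unparseable dates:", bad_dates)
```

To run the same check on the downloaded file, replace io.StringIO(csv_text) with the path to the CSV.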

Who Needs This Data

Developers building fintech applications can use this dataset to test features like expense tracking, budgeting tools, and financial dashboards without risking real customer data. Data scientists training machine learning models for fraud detection, spending prediction, or category classification will find it valuable because it includes realistic transaction patterns and labels.

Quality assurance testers use synthetic data to ensure their applications behave correctly under various financial scenarios. They can simulate edge cases, such as large purchases or negative balances, without relying on live data. This dataset provides a controlled and repeatable environment for testing.
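As a rough illustration of pulling such edge cases out of the data, the sketch below filters for negative balances and large purchases. The sample rows, the signed-amount convention, and the 1,000 threshold are all assumptions chosen for the example, not properties of the dataset.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the dataset description.
tx = pd.DataFrame({
    "account_id": ["A1", "A1", "A1", "A2"],
    "amount": [-20.00, -2500.00, -15.00, -40.00],
    "running_balance": [480.00, -2020.00, -2035.00, 960.00],
})

LARGE_PURCHASE = 1000.00  # hypothetical threshold; tune to your test scenario

# Rows where the account is overdrawn after the transaction.
overdrawn = tx[tx["running_balance"] < 0]

# Rows where a single purchase meets or exceeds the threshold
# (assumes purchases are stored as negative amounts).
large = tx[tx["amount"] <= -LARGE_PURCHASE]

print(f"Overdrawn rows: {len(overdrawn)}, large purchases: {len(large)}")
```

Rows selected this way can be saved as fixed test fixtures so the same scenarios are replayed on every test run.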

Use Cases

  • Testing a personal finance app that categorizes transactions and displays spending trends
  • Building a machine learning model to predict monthly spending based on historical patterns
  • Validating a bank dashboard that shows running balances and upcoming bills
  • Training an AI system to detect unusual spending behavior or potential fraud
  • Simulating user behavior for a fintech SaaS product before going live
  • Creating a demo scenario for investors to showcase financial analytics capabilities
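
Several of these use cases start from the same aggregation: total spending per month and category. A minimal sketch of that step follows, using hypothetical rows and assuming that negative amounts represent spending while positive amounts are refunds or deposits.

```python
import pandas as pd

# Hypothetical rows; category names are taken from the examples above.
tx = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-10", "2024-02-15"]
    ),
    "category": ["Food & Dining", "Shopping", "Food & Dining", "Utilities", "Shopping"],
    "amount": [-12.00, -80.00, -8.00, -60.00, 25.00],
})

# Assumption: negative amounts are spending; the positive row (a refund) is excluded.
spending = tx[tx["amount"] < 0].copy()
spending["month"] = spending["date"].dt.to_period("M")

# One row per month, one column per category, values as positive totals.
monthly = (
    spending.groupby(["month", "category"])["amount"]
    .sum()
    .abs()
    .unstack(fill_value=0.0)
)

print(monthly)
```

The resulting table can feed a spending-trend chart directly, or serve as the feature matrix for a simple monthly spending forecast.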

Loading It in Python

If you’re working with Python, loading this dataset is straightforward using pandas. The CSV file includes all necessary columns and is ready to explore. Here’s a quick snippet to get started:

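import pandas as pd

# Load the CSV from the current working directory
df = pd.read_csv('1_year_synthetic_bank_transaction_data.csv')

print(df.head())             # first five rows
print(f"Shape: {df.shape}")  # (rows, columns)
print(df.dtypes)             # type pandas inferred for each column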

You’ll see the first few rows of transaction data, including date, account_id, amount, and other fields. The shape should confirm 3,648 rows, and the data types show how pandas read each column, which tells you what conversions you may need before analysis.
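
By default pandas reads dates as plain strings, so a common next step is to parse them at load time and then count transactions per account. The sketch below does this on a small inline excerpt; the rows and the ISO date format are assumptions, and with the real file you would pass its path instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical excerpt with a subset of the columns described above.
csv_text = """date,account_id,amount
2024-01-01,A1,-4.50
2024-01-02,A1,-120.00
2024-01-02,A2,-35.25
"""

# parse_dates converts the named column to datetimes while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Number of transactions recorded for each account.
per_account = df.groupby("account_id").size()

print(df.dtypes)
print(per_account)
```

With the full dataset, per_account should list five accounts.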

Get the Dataset

Download 1 Year Synthetic Bank Transaction Data →
$39 one-time. Instant download. CSV format, ready to use.

More datasets and Python tools at OddShop