Formats, updates, refunds, commercial use, and the legal stuff.
Getting started
No — packages 1, 2, 3, and 4 include CSV and Excel formats, so you can open them in Excel, Google Sheets, or any BI tool. Packages 5, 6, and 7 are designed for Python / R / SQL users.
If you're an Excel user or just getting started, pick Package 1 or 4 (CSV + Excel). If you're a bettor building a model, pick Package 2 or 3. If you're a Python / R data scientist, pick Package 5 or 6 (Parquet). If you want everything in every format, including a pre-loaded SQLite database, pick Package 7 (Complete Diamond).
In Python: pd.read_parquet("file.parquet"). In R: arrow::read_parquet(). In SQL: DuckDB reads Parquet files directly. In Excel: open the matching CSV copy instead (every Parquet file in packages 3, 4, and 7 has one).
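For Python users, the options above can be wrapped in one small loader that picks the reader from the file extension. This is only a sketch: the helper name load_table and the file names in the comments are hypothetical, and reading Parquet with pandas assumes a Parquet engine such as pyarrow is installed.

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Load one package file into a DataFrame, choosing the reader by extension.

    Hypothetical helper for illustration; it is not shipped with the packages.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".parquet":
        # pandas delegates to a Parquet engine (pyarrow or fastparquet).
        return pd.read_parquet(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


# Example usage (file names are placeholders, not real package contents):
# pitches = load_table("file.parquet")
# pitches_csv = load_table("file.csv")
```

Falling back to the CSV copy is a reasonable choice when no Parquet engine is available, since both files are expected to hold the same table.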
Package 1 ≈ 28 MB · Package 2 ≈ 27 MB · Package 3 ≈ 47 MB · Package 4 ≈ 93 MB · Package 5 ≈ 2.7 GB · Package 6 ≈ 3.1 GB · Package 7 ≈ 6.5 GB. The big ones include all 12 seasons of Statcast pitch-by-pitch.
Updates
Every package is updated once a year, in November. The latest version is always available at current pricing. A purchase covers the version available at the time you buy; future annual releases are separate purchases.
The new season is added, mid-season corrections from upstream sources are pulled in, derived stats (career_war_cumulative, head-to-head matchups, velocity fatigue curves, etc.) are recomputed, and any data dictionary updates are bundled in. The schema stays stable — your existing scripts keep working when you upgrade to the latest version.
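If you want to confirm for yourself that a new release has not dropped any columns your scripts rely on, a quick header comparison is usually enough. The sketch below uses only the Python standard library; the function names and the file paths in the comments are hypothetical, and of the column names shown only career_war_cumulative comes from the description above.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # An empty file yields an empty header instead of raising.
        return next(csv.reader(f), [])


def schema_diff(old_columns, new_columns):
    """Return (removed, added) column names between two releases, sorted."""
    old, new = set(old_columns), set(new_columns)
    return sorted(old - new), sorted(new - old)


# Example usage (paths are placeholders for last year's and this year's copy):
# removed, added = schema_diff(read_header("old/file.csv"),
#                              read_header("new/file.csv"))
# An empty `removed` list means every column your scripts used is still there.
```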
Downloads & access
Your download link is in your receipt email, so keep it safe. If you have issues, contact Payhip support directly at payhip.com. We do not have access to your payment or account information.
Please don't. Each purchase is a single-user license. Download links are personal to your purchase. Sharing your link violates our terms.
Contact Payhip support directly at payhip.com — they can help you recover access. We do not have access to your payment or account information and cannot manually resend links.
Legal & usage
Yes. All data is compiled from publicly available sources. What you're paying for is the work product: cleaning, deduplicating, joining, enriching, and packaging in usable formats. See the disclaimer at the bottom of every page.
Yes, for research, modeling, betting, fantasy, internal analytics, and editorial use. Not for resale of the raw data itself. You can publish derived insights, charts, models, or articles based on the data, but you cannot bundle our files and resell them as a competing dataset product.
One purchase = a non-transferable license for you (or a single team / company) to use the data internally and to publish derived work. No redistribution of the raw files. No reselling. No exclusivity claims. Attribution is appreciated but not required for derivative work.
"Data compiled and enriched from publicly available sources for research and analytical purposes. Not affiliated with or endorsed by any sport data provider or league. All derived features, enrichments, and computed statistics are original work product."
Refunds
All sales are final. These are digital data files: once accessed or downloaded, we cannot un-deliver the data, so we do not offer refunds under any circumstances. We strongly recommend downloading the free sample before purchasing to confirm the data meets your needs. If you experience a technical issue with your download, contact [email protected] and we will make it right.
If a file is genuinely corrupted or fails to open, contact us at [email protected] with details. We will verify the issue and provide a working file. This is the only exception to our no-refunds policy — and it applies to technical delivery failures only, not change-of-mind purchases.
Custom requests
Contact us — we can discuss custom extracts, alternative formats (DuckDB, BigQuery, Snowflake), or adding fields to one of the existing bundles. Pricing is quoted per request.
Yes. A single purchase covers an internal team or research group. For full multi-seat redistribution rights, contact us for an enterprise license.
Yes — more sports are planned. The exact timeline depends on data availability. Keep your receipt email to stay informed, or check rawsportsvault.com for announcements.
Still have questions?
Email [email protected] for data questions, custom-dataset requests, and broken-file reports. We answer within one business day. For download-link, receipt, or payment issues, contact payhip.com directly, since we do not have access to your payment or account information.