# BUSINESS.md - Business Case & Marketing Strategy

> **Creator-only document. Do not ship to buyers.**

**Version**: 1.6
**Last updated**: April 28, 2026
**Owner**: Michael

---

## 1. Executive Summary

Sell niche-specific Python automation tools as one-time downloadable digital products. Target non-technical users who hate repetitive Excel/CSV work but cannot code. Distribute via Gumroad / Lemon Squeezy / Stripe with automated instant delivery. Cross-platform from launch (Windows, macOS, Linux). Each bundle ships with both a GUI (primary surface for non-technical buyers, runs in the buyer's browser locally) and a CLI (for power users and automation).

**Pricing model**: One-time purchase. Individual bundles $49-$79. Full suite $149.

**Goal**: Lifestyle cashflow. No saleable-asset exit required.

---

## 2. Market Opportunity

- Persistent, evergreen pain: manual data cleaning is universal across small business and freelance work.
- Low competition in highly vertical niches (e.g., Shopify pet supplies feeds vs. generic CSV cleaners).
- High margin: near-100% gross margin after initial creation.
- Distribution leverage: marketplace search + community presence + programmatic SEO + **hosted browser demo as a try-before-buy conversion surface** (added v1.3, see Section 7).

**Realistic distribution timeline note**: Marketplace listings (Gumroad, Lemon Squeezy directory) and niche community posts can produce paying customers within days to weeks. New-domain SEO will not produce traction inside 90 days. Plan early-stage distribution around marketplaces, communities, and a hosted demo; treat owned-domain SEO as a 6-18 month compounding asset.

---

## 3. Target Customers

Primary:
- Shopify store owners (Pet Supplies niche identified as priority).
- Small business owners needing reporting and finance automation.
- Freelancers and consultants who handle client data.
- Local marketing agencies.

Anti-personas (do not waste effort here):
- Enterprise data teams (will build it themselves).
- Pure technical buyers (will pip install something free).

---

## 4. Product Strategy

**Lead product**: Excel & CSV Data Cleaning Mastery Bundle (highest-pain, broadest demand).

**Bundle roadmap**:
1. Data Cleaning Mastery (lead, in progress).
2. Automated Business Reporting.
3. Ecommerce Data Pipeline.
4. Small Business Finance.
5. Marketing Public Data Aggregation.
6. AI Ecommerce Aggregation - Shopify Pet Supplies (vertical niche play).

**Sequence rule**: Do not start bundle 2 until bundle 1 has paying customers and at least one external review. Building five skeleton bundles in parallel is a known failure mode.

**Product surface (locked v1.3)**: Each bundle ships as a desktop install (Windows / macOS / Linux) with both a Streamlit-based GUI and a CLI. A constrained version of the GUI is also deployed publicly to Streamlit Community Cloud as a free browser demo. See TECHNICAL.md Sections 1-3 and DECISIONS.md Section 4c for the full architecture.

---

## 4a. Lead Bundle Deep Dive: Deduplicator Use Cases & Competitive Position (added v1.6)

The deduplicator is the lead because it has the highest pain density across all four target personas. This section captures the use-case map, competitive landscape, and market gap statement. It feeds landing page copy, demo dataset design, and feature prioritization. Companion technical spec is in TECHNICAL.md Section 10.1.

### Use cases by buyer persona

**Shopify store owner (priority niche)**
1. Customer list cleanup: same person with `john@gmail.com` and `John@Gmail.com` and `j.ohn@gmail.com` (Gmail ignores dots), or with two phone formats.
2. Product catalog dedup: same SKU listed with trailing whitespace, case differences, or near-identical names ("Dog Collar - Red - Large" vs "Dog Collar Red L").
3. Abandoned cart cleanup before re-engagement campaign (don't email the same person 3 times).
4. Order export consolidation when pulling from Shopify + Amazon + manual entry.
5. Subscriber list hygiene before importing to Klaviyo / Mailchimp (every duplicate costs money on per-contact pricing).

**Small business / bookkeeper**
6. Bank export reconciliation: same transaction appearing in two exports across overlapping date ranges.
7. Vendor list consolidation across QuickBooks, spreadsheets, and email.
8. Customer master record cleanup before invoicing migration.
9. Expense report dedup where employees submit the same receipt twice.

**Freelancer / consultant**
10. Pre-analysis cleanup of client-supplied data dumps (almost always have dupes).
11. Survey response dedup (same respondent submitting twice from different devices).
12. Lead list cleanup before client handoff.

**Marketing agency**
13. Email list deduplication across multiple lead sources before campaign send.
14. Audience reconciliation when running multi-platform campaigns (Facebook + Google + organic forms).
15. Suppression-list management (combine unsubscribes across lists).

**Highest-pain, highest-frequency**: 1, 5, 6, 13. Build the feature set, sample dataset, and demo around these first. Landing page copy should lead with these scenarios. The hosted demo's pre-loaded dataset should make at least two of these cases obvious within ten seconds.

### Competitive landscape

| Tool | Price | Strength | Weakness vs. this product |
|---|---|---|---|
| Excel "Remove Duplicates" | Free | Universal, zero install | Exact match only. No fuzzy. No audit log. |
| Pandas `drop_duplicates` | Free | Powerful | Requires Python skill. Buyer doesn't have it. |
| OpenRefine | Free | Powerful clustering, fuzzy | Steep learning curve, dated GUI, intimidating for non-technical users. |
| Dedupe.io | ~$30+/mo SaaS | ML-based fuzzy | Recurring cost, cloud upload (privacy concern for client data), overkill for small jobs. |
| WinPure / Data Ladder | $300-2000+ | Enterprise-grade | Wrong price tier and complexity for solo operators. |
| Power Query (Excel) | Free | Integrated | Exact match only, no fuzzy without M-code skill. |

### The market gap this product fills

The market has a hole between "Excel (too basic)" and "OpenRefine / Dedupe.io (too complex or expensive or cloud-bound)." That hole is:

> Fuzzy match quality of OpenRefine, with the zero-learning-curve UX of Excel, sold once for under $100, runs locally on the buyer's machine.

This is a defensible position **only if** the product delivers fuzzy match quality the buyer can trust without reading documentation. If fuzzy is mediocre, the product loses to free Excel. If UX requires learning, it loses to free OpenRefine. The Tier 1 functional spec in TECHNICAL.md Section 10.1 is the minimum viable feature set to occupy this gap.

### Pricing sanity check (lead bundle specifically)

$49-$79 is correct for this position. Above $99 the buyer expects SaaS support (which conflicts with the no-touch constraint). Below $30 it competes with free and signals "toy." See Section 5 for full pricing rationale.

---

## 5. Pricing

| Tier | Price | Notes |
|---|---|---|
| Single bundle | $49 - $79 | Sweet spot for individual buyer impulse purchase |
| Full suite (when 3+ bundles exist) | $149 | Anchor price, drives bundle attach |

Rationale: $49-$79 is below the threshold that triggers procurement / approval friction for solo operators. Above $99 the buyer expects a SaaS or human support.

---

## 6. Revenue Targets (realistic, tiered)

Replacing the original "$50k/mo ceiling" target with evidence-based tiers for solo digital product sellers in this category:

| Horizon | Target | Notes |
|---|---|---|
| First 90 days | First paying customer | Validates the funnel, not the business |
| 6 months | $1,000 - $3,000 / mo | Achievable with working lead bundle + marketplace presence + hosted demo |
| 12 months | $5,000 / mo | Realistic 12-month goal. Triggers re-evaluation of the "fully async" constraint (see Section 8) |
| 24 months | $10,000 / mo | Stretch target. Requires either a hit product or 3+ bundles compounding |

$20k+/mo is achievable but requires a channel asset (audience, brand, content) that the current operator constraints exclude. Not a target.

---

## 7. Marketing Strategy

**Channels (in priority order, early stage)**:
1. **Hosted browser demo** (added v1.3). Free, public Streamlit Community Cloud deployment of a constrained version of each bundle. Linked prominently from every Gumroad / Lemon Squeezy listing and the landing page as "Try it free in your browser." Direct conversion lever: prospects can validate quality before purchase, which is otherwise impossible for digital downloads at this price.
2. Marketplace listings (Gumroad search, Lemon Squeezy directory, GitHub).
3. Niche community presence (subreddits, Indie Hackers, niche Slack/Discord) - value-first posts, not promotion. The hosted demo doubles as the asset shared in these posts.
4. Programmatic SEO landing pages targeting long-tail keywords (compounds over months).
5. Strong GitHub README as discovery surface.

**Hosted demo design**:
- Same core engine as the paid product, GUI front-end only.
- Constrained: row limit (e.g., 100 rows on output), watermark on output files, sample dataset preloaded plus optional small-file upload (capped size).
- Persistent CTA on every page: "Like what you see? Get the full version for $49 ->" linking to Gumroad.
- No login or signup required to use the demo. Friction kills conversion.
- Hosted on Streamlit Community Cloud (free) at launch. Migrate to $5/mo VPS if rate limits or branding constraints become an issue.

**Target keywords (long-tail, low competition)**:
- python csv cleaner bundle
- excel data cleaning scripts
- automated data deduplicator python
- csv duplicate removal tool
- shopify product feed cleaner

**Funnel**:
- Discovery (marketplace search / community post / SEO) -> Hosted demo (try-before-buy) -> Landing page -> Gumroad checkout -> Stripe payment -> automated email delivery -> upsell sequence to next bundle.

**Support model**: Self-serve documentation in every download. Email support only, no live chat, no calls.

---

## 8. The "Fully Async, No-Touch" Constraint

The locked criteria require fully automated, no-touch marketing and sales. This is preserved as the long-term steady state. However:

**Revisit trigger**: When monthly recurring revenue reaches $5,000/mo.

**Why revisit**: At early stage, the no-touch constraint rules out the channels most likely to produce first traction (direct outreach to 50 Shopify pet operators, founder-led community participation, customer interviews). These are time-bounded activities, not permanent commitments. Strict adherence to "no-touch" before product-market fit may cost more revenue than it saves time.

**Action at trigger**: Re-evaluate whether selective non-async activity (e.g., 2 hours/week of community participation, or a small founder audience build) would compound revenue faster than additional bundle development. Decision is yours; this document only flags the trigger.

Until $5k/mo, operate under the locked async-only rule.

---

## 9. Cost Structure

**Recurring (monthly budget cap: $1,200)**:

| Item | Cost | Notes |
|---|---|---|
| Gumroad / Lemon Squeezy fees | ~10% of revenue | Net of merchant fees, no flat cost |
| Domain | ~$15/yr | One-time annual |
| Hosting (landing pages) | $0 - $20/mo | Static hosting via Cloudflare Pages, Netlify, or GitHub Pages is free |
| Hosting (browser demos) | $0 at launch | Streamlit Community Cloud free tier. Plan for $5-10/mo VPS migration if scale or branding requires |
| Email service (transactional + sequences) | $0 - $30/mo | Free tier covers early volume |
| Apple Developer Program | $99/yr (~$8/mo) | Required for macOS code signing - see Section 10 |
| Inno Setup (Windows installer) | Free | One-time download |
| PyInstaller, Streamlit, Python tooling | Free | All open source |
| **Total fixed monthly** | **~$30-70/mo** | Well under $1,200 cap |

Headroom in the budget allows for optional ad spend ($100-200/mo) once a bundle has proven conversion data.

---

## 10. macOS Code Signing (Apple Developer Program)

**Required cost**: $99/year, paid to Apple.

**Why it's required**:
macOS includes a security layer (Gatekeeper) that blocks unsigned applications by default. When a non-technical buyer downloads an unsigned `.app` or `.dmg`, macOS shows a hard-block dialog: *"This app cannot be opened because the developer cannot be verified."* The only obvious button is "Move to Trash."

The bypass exists (right-click > Open, then confirm in a second dialog), but the target buyer persona will not perform it. The likely outcomes for unsigned Mac builds: refund requests, support tickets, or silent abandonment.

**What the $99/yr buys**:
- A code signing certificate. Removes the hard block.
- Notarization service (included). Apple scans the binary and stamps it; this removes the secondary "downloaded from internet" warning too. Result: clean double-click-to-run experience.

**Setup notes**:
- Requires Apple ID + government ID (for individuals) or D-U-N-S number (for organizations).
- First-time approval takes 1-2 weeks. Plan accordingly.
- Once approved, signing and notarization is automated in the build pipeline (see TECHNICAL.md).

**Decision**: Pay for it. The cost is trivial relative to the conversion-rate impact for the non-technical buyer persona.

---

## 11. Risks & Mitigation

| Risk | Mitigation |
|---|---|
| Commoditization (free scripts on GitHub) | Niche verticals + polished GUI + cross-platform installers + hosted demo |
| Slow early traction | Lead with hosted demo + marketplaces + communities, not own-domain SEO |
| Refund chargebacks | Clear scope on landing page, hosted demo lets buyers validate before purchase, working samples included |
| macOS install friction | Apple Developer Program ($99/yr), code signing + notarization |
| Browser-launch UX confusion (GUI opens in browser locally) | Single sentence in installer welcome and email; persistent in-app "runs locally, no internet used" message; pywebview native-window wrap as v1.1 enhancement if needed |
| Customer support burden | Robust installers, idiot-proof docs, sample data included, hosted demo lets prospects self-evaluate |
| IP theft / resale | License file. Accept this is partial protection; focus on staying ahead via updates |
| Platform risk (Gumroad / Lemon Squeezy policy change) | Multi-marketplace from day one; own domain as fallback |
| Streamlit project direction change breaks desktop packaging | Low probability; flagged as criteria-relock trigger in DECISIONS.md Section 8 |

---

## 12. Success Metrics

Tracked monthly:
- Units sold per bundle.
- Conversion rate (landing page -> purchase).
- **Demo-to-purchase conversion rate** (added v1.3): hosted demo visits -> Gumroad clicks -> purchases.
- Refund rate (target < 5%).
- Support tickets per 100 sales (target < 10).
- Organic traffic to product pages.
- Per-platform install success rate (Windows, macOS, Linux).

---

## 13. Honest Status (April 28, 2026)

- 1 of 9 scripts is real and tested (`01_deduplicator.py`). The other 8 are skeletons. **Expected at project start.**
- Cross-platform build pipeline (PyInstaller-based) designed but not yet built.
- macOS code signing not yet set up (Apple Developer Program enrollment pending).
- Streamlit GUI not yet built (locked as the framework as of v1.3).
- Hosted demo not yet deployed.
- No paying customers yet.
- No live landing page yet.

**Next concrete steps before any marketing spend**:
1. Build the Streamlit GUI for the lead script (`01_deduplicator.py`). Apply UX standards from DECISIONS.md Section 4b.
2. Stand up the PyInstaller cross-platform build pipeline with Streamlit launcher (see TECHNICAL.md Sections 3.3 and 3.4). Budget 1-3 days for first-time Streamlit-PyInstaller integration.
3. Deploy the constrained demo version to Streamlit Community Cloud.
4. Enroll in Apple Developer Program (1-2 week lead time - start in parallel with the above).
5. Stand up a single landing page for the lead bundle, with the hosted demo prominently linked.
6. Finish at least 2 more of the 9 scripts to working state with both CLI and GUI.
7. List on Gumroad with sample output proof, per-platform installer downloads, and hosted demo link.
