CoinGradingApp tests coin grading apps for returning hobbyists who want to understand why an app assigned a grade — not just accept a number. We score grading apps on strike-type intelligence, transparency, and honest handling of cleaned or damaged coins.
Who We Are
Two of us gave up hand-grading coins in the early 2000s. We sold most of our collections, forgot half of what we knew about the Sheldon scale, and left the hobby for years. Then AI happened. When we came back to numismatics in 2022, we found ourselves holding the same Lincoln cents and Morgan dollars we remembered, but facing a wall of grading apps claiming certainty we didn't feel. One of us trusted an app's AI grade on a Proof Lincoln cent, submitted the coin for authentication on the strength of that grade, paid $45 for grading, and received a grade two points lower. That taught us a hard lesson: AI grading tools are useful, but only if you understand what they're actually seeing and where they fail. That's why we started testing grading apps the way we wish they'd been built.
Methodology
We test each grading app against 38 coins across four categories: business strikes (Lincoln wheat cents, Mercury dimes, Standing Liberty quarters), Proof strikes (Franklin halves, early-date Proofs), Specimen strikes (rare SMS issues), and problem coins (cleaned, lightly gouged, or environmentally damaged). We spend 60 to 90 hours per app over 12 to 16 weeks. At least three coins in each strike-type category are tested three times: once with clean photos, once with angled light to reveal strike weakness and wear, and once with the coin partially obscured to test how the app handles incomplete data. We compare the app's assigned grade against the published PCGS and NGC grades for the same coins, and we track how often the app's reasoning (if provided) aligns with Sheldon scale definitions.
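The grade comparison described above can be sketched roughly as follows. This is a minimal illustration rather than our actual tooling: the `GradeResult` record, the `summarize` function, the one-point tolerance, and the three sample rows are all hypothetical and exist only to show the arithmetic.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GradeResult:
    coin: str
    category: str          # "business", "proof", "specimen", or "problem"
    app_grade: int         # numeric Sheldon grade assigned by the app (1-70)
    reference_grade: int   # published PCGS or NGC grade for the same coin


def summarize(results, tolerance=1):
    """Compare app grades with reference grades.

    `tolerance` is an assumed cutoff (in Sheldon points) for counting an
    app grade as "close enough"; it is not a figure from our methodology.
    """
    diffs = [abs(r.app_grade - r.reference_grade) for r in results]
    within = sum(d <= tolerance for d in diffs)
    return {
        "mean_abs_diff": mean(diffs),
        "share_within_tolerance": within / len(results),
    }


# Made-up sample rows, for illustration only.
sample = [
    GradeResult("Lincoln wheat cent", "business", 63, 64),
    GradeResult("Franklin half", "proof", 67, 65),
    GradeResult("Mercury dime", "business", 58, 58),
]
summary = summarize(sample)
print(summary)
```

With these made-up rows the app is off by an average of one point, and two of the three grades land within one point of the reference. Grouping the same summary by `category` would show whether an app drifts further on Proofs than on business strikes.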
Our Standards
We believe AI grading tools should handle the rare-strike scenarios, because that's where most returning hobbyists get surprised. A business strike Lincoln cent is straightforward. A Proof is not: Proof coins have a different surface finish, different wear patterns, and different grading criteria. Specimen strikes (SMS coins from the 1960s, for example) are rarer still, and most grading apps either misidentify them as Proofs or assign a business-strike grade to a coin that should never have been processed that way. We score apps heavily on their ability to recognize strike type from the photo alone, and we penalize apps that require you to select 'Proof' or 'Business Strike' manually before they'll grade, because that defeats the purpose of AI recognition. Beyond strike type, we look at how apps handle the in-between cases: lightly cleaned coins, coins with environmental toning that mimics wear, and coins with strike or die characteristics (die cracks, weak strikes) that amateur eyes often misread as damage. An app that admits 'this coin is too obscured to grade safely' is more trustworthy than one that forces a number anyway.
Disclosure
We do not accept paid placement or sponsor relationships with app developers. We do not review grading apps we have not tested hands-on for at least 60 hours across at least 25 coins. We do not claim that AI grading can replace in-hand authentication by a third-party service. We do not score an app on strike-type handling unless it correctly distinguishes business strikes from Proofs in at least 90% of our test cases. We also do not test every grading app on the market: we focus on tools with active user bases and consistent updates, and we do not claim expertise in ancients, world coins, or hyper-specialized varieties beyond our test set of US circulating and commemorative issues.
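As a rough sketch of how the 90% strike-type cutoff above works out in practice, assuming it means the app must correctly tell business strikes from Proofs in at least 90% of the relevant test cases. The function names, the label strings, and the sample predictions are hypothetical.

```python
def strike_type_accuracy(predictions):
    """Share of business-strike and Proof coins whose strike type the app got right.

    `predictions` is a list of (predicted, actual) label pairs. Coins whose
    actual type is neither "business" nor "proof" are ignored here.
    """
    relevant = [(p, a) for p, a in predictions if a in ("business", "proof")]
    if not relevant:
        return None  # nothing to measure
    correct = sum(p == a for p, a in relevant)
    return correct / len(relevant)


def eligible_for_strike_score(predictions, threshold=0.90):
    """True if the app clears the cutoff and so receives a strike-type score."""
    accuracy = strike_type_accuracy(predictions)
    return accuracy is not None and accuracy >= threshold


# Made-up example: 10 relevant coins, one Proof misread as a business strike.
sample_predictions = (
    [("business", "business")] * 6
    + [("proof", "proof")] * 3
    + [("business", "proof")]
)
print(strike_type_accuracy(sample_predictions))      # 0.9
print(eligible_for_strike_score(sample_predictions))  # True
```

In this made-up run, nine of ten coins are identified correctly, so the app just clears the cutoff. A second misread would drop it to 80% and leave it without a strike-type score.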
Contact
If you've tested a grading app and found something surprising — or if you're a developer with a tool you'd like us to evaluate — contact us through the site's contact form. We also welcome suggestions for specific coins or strike-type scenarios you'd like us to include in future test rounds.