Repository: https://github.com/hungson175/finkit.git
Package: vn-finkit 0.1.0
Reviewed by: Financial-engineer tester (external)
Date: 2026-06-17
Environment: Linux x86_64, Python 3.12.12, fresh editable install from source
finkit is an ambitious, well-architected Vietnamese-market data library. It replaces a vendored scraper kit with a typed spine: 13 provider adapters, 15 canonical schemas, automatic fallback, a FastAPI service, and a VCR-based offline test suite. The codebase shows strong engineering discipline (schema contracts, typed errors, secret redaction, drift detection).
However, as a downstream financial engineer I hit real, reproducible failures in the public consumer seam (Router) on the very first day of use. The most serious one is that Router.history("BTCUSDT") silently returns an empty frame because VN equity adapters return empty for crypto symbols instead of raising a retryable error, so the router never reaches Binance. A live drift test already exists for this case and currently fails.
The library is not production-ready for cross-asset work until router/asset-class routing is tightened. VN-equity workflows, on the other hand, are in much better shape.
2747 passed, 59 deselected, 2 warnings in 27.15s
Coverage: 95.80% (fail_under = 90.0%)
- All offline tests pass.
- Coverage is excellent and honestly measured (
branch = false, so this is line coverage). - Warnings are benign (Starlette deprecation, pandas date-parsing fallback in vendored oracle).
48 passed, 8 skipped, 2 failed in 7.80s
Failures:
tests/live/test_live_router.py::TestRouterPaths::test_router_history_crypto
Router.history("BTCUSDT", ...)returns an empty DataFrame; the live sanity check asserts non-empty and fails.tests/live/test_live_spine.py::TestKBS::test_financial_statement
KBS financials endpoint returns HTTP 404 (No context-path matches the request URI).
tests/test_cassette_freshness.py::test_no_cassette_exceeds_ttl PASSED
Cassettes are within the 30-day TTL. (This was run on a fresh clone; git commit dates are used.)
ruff check src/finkit tests/finkit -> 18 errors (E731, F401)
None are critical, but the repo currently does not pass its own linter. They are all auto-fixable.
- Canonical schemas (
finkit.core.schema) are a genuine improvement over the vendored code. Every frame carries the same column order, dtypes, timezone, provenance, and enum-domain checks. - Typed error hierarchy is clean and machine-readable (
is_retryable,category). - One transport handles retries, rate limits, secret redaction, and body capture. The
Retry-Afterdate/delta parsing is a nice touch. - Capability protocols instead of a 14-method ABC mean adapters only implement what they serve.
- REST API is thin, well-structured, and maps errors to stable HTTP status codes.
VCI.history("FPT"),VCI.history("VNINDEX"),KBS.price_board([...]),KBS.listing("HOSE"),VND.market_movers/valuation,MBK.macro,SPL.commodity, andFMARKET.fund_*all returned real, schema-valid data from this non-VN IP.- Binance direct calls (
adapter.history,adapter.intraday,adapter.price_depth) return correct OHLCV/order-book data. - OHLC integrity checks (high ≥ low, close inside range, no NaN prices, non-negative volume) pass on Binance data.
- Invalid symbols, unsupported intervals, and inverted date ranges all raise
InvalidInputErrorbefore any network call. - Symbol allowlist
^[A-Z0-9]{3,9}$is enforced, preventing URL injection.
README.md,docs/user-guide/,docs/design/, and inline docstrings are unusually thorough.- The offline-green is NOT live-green warning about VCR staleness is exactly the right message.
Reproduction:
from finkit.api.deps import build_router
router = build_router()
# Returns 0 rows
router.history("BTCUSDT", start="2026-05-01", end="2026-06-01", interval="1D")
# Works fine
router.registry.get("BINANCE").history("BTCUSDT", start="2026-05-01", end="2026-06-01", interval="1D")Root cause:
DEFAULT_ORDER["history"] is ["VCI", "KBS", "MAS", "FMP", "BINANCE"]. VCI is called first. VCI does not validate that the symbol belongs to a VN equity/index before sending the request; the upstream VCI endpoint returns an empty columnar payload for BTCUSDT, which the adapter converts to a canonical empty frame. The router treats an empty frame as an authoritative "zero rows" answer and does not fall back.
The same happens with KBS (also returns 0 rows); MAS raises HTTP 400, which is non-retryable, so the chain stops before reaching Binance anyway.
Why this matters:
A financial engineer consuming the public Router API cannot fetch crypto history without explicitly pinning source=BINANCE. The advertised cross-asset convenience (mkt.history("BTCUSDT", ...)) is broken.
Evidence:
Live drift test test_router_history_crypto fails exactly here:
AssertionError: Router.history(BTCUSDT): zero rows — endpoint returned no data
Recommended fix:
- Adapter-side asset-class guard: VN equity adapters should check
self.registry.asset_type(symbol)and raiseCapabilityNotSupported(retryable) forAssetType.CRYPTObefore hitting the network. They can never serve crypto. - Router-level capability routing: Alternatively, add an
asset_typeaware dispatch so the router can skip VN brokers for crypto symbols. This is more robust because it works even if a broker returns a misleading empty frame. - Consider treating an empty OHLCV frame from a provider that does not claim the asset class as retryable, or at least logging a warning that fallback was aborted.
Reproduction:
router.financial_statement("FPT", "income_statement", period="year", lang="en")
# raises ProviderHTTPError: HTTP 404 from KBSMAS.financial_statement("FPT", ...) works fine (337 rows), but the router never reaches it because KBS is tried first and a 404 is classified as non-retryable.
Root cause:
The transport treats all 4xx as authoritative caller/endpoint errors. That is usually correct (delisted symbol, wrong URL), but a 404 from a known-dead endpoint should trigger fallback.
Recommended fix:
- Distinguish "endpoint unreachable / dead" from "symbol not found." One pragmatic approach: if a provider consistently 404s across multiple symbols in the live drift suite, mark the provider as unhealthy for that capability and skip it in the router. Or, make
ProviderHTTPError.is_retryablecontext-aware for dead-endpoint detection. - Short-term: reorder financials to put
MASfirst, or removeKBSfrom financials until the endpoint is restored. The live suite already flags this.
Reproduction:
GET /v1/history?symbol=BTCUSDT&start=2026-05-01&end=2026-06-01&source=FMP
# 500 {"error":"FinkitError","detail":"No provider registered for source 'FMP'"}A bad source parameter is a caller error and should return 4xx (e.g., 400 or 422), not 500.
Recommended fix:
Catch the registry lookup error in the route and raise InvalidInputError("source", ...) so it maps to 422.
The history capability mixes VN equity, VN index, and crypto providers in one fallback chain. This is convenient but requires every upstream adapter to fail in a retryable way for symbols it cannot serve. The current design trusts adapters to do this; at least VCI and KBS do not.
Recommended fix:
Consider a per-asset-class dispatch:
if registry.asset_type(symbol) == AssetType.CRYPTO:
order = ["BINANCE"]
elif registry.asset_type(symbol) == AssetType.INDEX and symbol in GLOBAL_INDICES:
order = ["FMP", "STOOQ"]
else:
order = DEFAULT_ORDER["history"]This removes the implicit assumption that VN brokers will refuse crypto symbols cleanly.
The default is inconsistent across normalizers. While documented, it is an easy foot-gun for adapter authors. A single default (or explicit required argument) would reduce risk.
Same class as 3.1: router.price_depth("BTCUSDT") returns 0 rows because VN brokers are tried first. Pinning source=BINANCE works.
Because keys are not shipped, these adapters are not registered unless env vars are set. The offline tests cover them with VCR/synthetic data, but there is no CI path that proves the keyed integration still works. Add a nightly or keyed CI job.
ruff check reports 18 errors, mostly E731 (lambda assignments) and F401 (unused imports). These should be fixed and added to CI.
fastapi.testclient warns that httpx is deprecated in favor of httpx2. This will become a hard failure in a future FastAPI/Starlette release.
Minor: datetime.utcnow() is deprecated in Python 3.12. Review scripts should use datetime.now(timezone.utc).
Using assert for production validation means it can be disabled with python -O. Financial correctness checks (OHLC integrity, enum domains, primary-key uniqueness) should use explicit if ...: raise AssertionError(...) or dedicated exceptions.
branch = false in pyproject.toml. For a library that relies heavily on fallback branches and error paths, branch coverage should be enabled and the floor raised gradually.
| Task | Via Router |
Direct adapter | Notes |
|---|---|---|---|
| VN stock OHLCV (FPT) | ✅ | ✅ | Works |
| VN index OHLCV (VNINDEX) | ✅ | ✅ | Works |
| Crypto OHLCV (BTCUSDT) | ❌ | ✅ | Router bug (Issue 3.1) |
| Crypto depth | ❌ | ✅ | Router bug |
| Global index (SPX) | Stooq WAF from datacenter; FMP needs key | ||
| VN market movers/valuation | ✅ | ✅ | Works |
| VN macro (CPI) | ✅ | ✅ | Works |
| VN funds | ✅ | ✅ | Works |
| Commodities (gold) | ✅ | ✅ | Works |
| Financial statements | ❌ | ✅ (MAS) | KBS dead, no fallback |
| Company overview | ✅ | ✅ | Works |
| Listing | ✅ | ✅ | Works |
| Price board | ✅ | ✅ | Works |
- Fix the crypto router bug by adding asset-class guards in VN equity adapters or asset-class-aware dispatch in
Router. Re-runtest_router_history_cryptolive until it passes. - Fix the KBS financials dead endpoint either by removing KBS from the financials order, making dead-endpoint 404s retryable, or restoring the endpoint.
- Return 4xx instead of 500 for an invalid/unknown
sourcequery parameter in the REST API.
- Enable
branch = truein coverage and add branch-level tests for router fallback paths. - Fix all
rufferrors and add a lint job to CI. - Add a keyed-provider CI job (FMP + FRED) that runs at least weekly.
- Update
datetime.utcnow()usages to timezone-aware equivalents.
- Consider splitting the monolithic
historycapability intohistory(VN equity/index) andcrypto_history, or make the router asset-class-aware, to eliminate the whole class of cross-asset fallback bugs. - Add an "endpoint health" monitor that blacklists a provider+capability after repeated hard failures, rather than relying solely on static
DEFAULT_ORDER. - Document the exact semantics of empty frames vs
NoDataErrorvs 404 more prominently for API consumers.
finkit is a promising, well-tested foundation with clear architectural intent. For VN equity/index workflows it already delivers clean, schema-valid data with good error handling. For cross-asset and crypto workflows routed through the public Router, it currently fails silently and needs urgent fixes before it can be trusted in production.
The good news is that the failures are concentrated in the fallback/routing layer, not in the provider parsers or normalizers. Fixing Issues 3.1–3.3 would raise the library from "VN-only usable" to "genuinely cross-asset" and make the live drift suite green again.
Review generated by running: pytest -q, pytest --cov=src/finkit, pytest tests/live --run-live, live API probes against Binance/VCI/KBS/MAS/VND/MBK/SPL/FMARKET, and ruff check src/finkit tests/finkit.