Skip to content

Instantly share code, notes, and snippets.

@hungson175
Created June 17, 2026 04:35
Show Gist options
  • Select an option

  • Save hungson175/6eba6d6cdb6bfd4cc22d350fbea30cfd to your computer and use it in GitHub Desktop.

Select an option

Save hungson175/6eba6d6cdb6bfd4cc22d350fbea30cfd to your computer and use it in GitHub Desktop.
finkit library review - financial engineer perspective

finkit Library Review — Financial Engineer Perspective

Repository: https://github.com/hungson175/finkit.git
Package: vn-finkit 0.1.0
Reviewed by: Financial-engineer tester (external)
Date: 2026-06-17
Environment: Linux x86_64, Python 3.12.12, fresh editable install from source


Executive Summary

finkit is an ambitious, well-architected Vietnamese-market data library. It replaces a vendored scraper kit with a typed spine: 13 provider adapters, 15 canonical schemas, automatic fallback, a FastAPI service, and a VCR-based offline test suite. The codebase shows strong engineering discipline (schema contracts, typed errors, secret redaction, drift detection).

However, as a downstream financial engineer I hit real, reproducible failures in the public consumer seam (Router) on the very first day of use. The most serious one is that Router.history("BTCUSDT") silently returns an empty frame because VN equity adapters return empty for crypto symbols instead of raising a retryable error, so the router never reaches Binance. A live drift test already exists for this case and currently fails.

The library is not production-ready for cross-asset work until router/asset-class routing is tightened. VN-equity workflows, on the other hand, are in much better shape.


1. Test Results

1.1 Offline unit + VCR integration suite

2747 passed, 59 deselected, 2 warnings in 27.15s
Coverage: 95.80% (fail_under = 90.0%)
  • All offline tests pass.
  • Coverage is excellent and honestly measured (branch = false, so this is line coverage).
  • Warnings are benign (Starlette deprecation, pandas date-parsing fallback in vendored oracle).

1.2 Live drift suite (pytest tests/live --run-live)

48 passed, 8 skipped, 2 failed in 7.80s

Failures:

  1. tests/live/test_live_router.py::TestRouterPaths::test_router_history_crypto
    Router.history("BTCUSDT", ...) returns an empty DataFrame; the live sanity check asserts non-empty and fails.
  2. tests/live/test_live_spine.py::TestKBS::test_financial_statement
    KBS financials endpoint returns HTTP 404 (No context-path matches the request URI).

1.3 Cassette freshness

tests/test_cassette_freshness.py::test_no_cassette_exceeds_ttl PASSED

Cassettes are within the 30-day TTL. (This was run on a fresh clone; git commit dates are used.)

1.4 Static analysis

ruff check src/finkit tests/finkit  ->  18 errors (E731, F401)

None are critical, but the repo currently does not pass its own linter. They are all auto-fixable.


2. What Works Well

2.1 Architecture & contracts

  • Canonical schemas (finkit.core.schema) are a genuine improvement over the vendored code. Every frame carries the same column order, dtypes, timezone, provenance, and enum-domain checks.
  • Typed error hierarchy is clean and machine-readable (is_retryable, category).
  • One transport handles retries, rate limits, secret redaction, and body capture. The Retry-After date/delta parsing is a nice touch.
  • Capability protocols instead of a 14-method ABC mean adapters only implement what they serve.
  • REST API is thin, well-structured, and maps errors to stable HTTP status codes.

2.2 Data correctness (when the right adapter is reached)

  • VCI.history("FPT"), VCI.history("VNINDEX"), KBS.price_board([...]), KBS.listing("HOSE"), VND.market_movers/valuation, MBK.macro, SPL.commodity, and FMARKET.fund_* all returned real, schema-valid data from this non-VN IP.
  • Binance direct calls (adapter.history, adapter.intraday, adapter.price_depth) return correct OHLCV/order-book data.
  • OHLC integrity checks (high ≥ low, close inside range, no NaN prices, non-negative volume) pass on Binance data.

2.3 Input validation

  • Invalid symbols, unsupported intervals, and inverted date ranges all raise InvalidInputError before any network call.
  • Symbol allowlist ^[A-Z0-9]{3,9}$ is enforced, preventing URL injection.

2.4 Documentation

  • README.md, docs/user-guide/, docs/design/, and inline docstrings are unusually thorough.
  • The offline-green is NOT live-green warning about VCR staleness is exactly the right message.

3. High-Severity Issues

3.1 Router silently returns empty frames for cross-asset symbols ⭐ CRITICAL

Reproduction:

from finkit.api.deps import build_router
router = build_router()

# Returns 0 rows
router.history("BTCUSDT", start="2026-05-01", end="2026-06-01", interval="1D")

# Works fine
router.registry.get("BINANCE").history("BTCUSDT", start="2026-05-01", end="2026-06-01", interval="1D")

Root cause:

DEFAULT_ORDER["history"] is ["VCI", "KBS", "MAS", "FMP", "BINANCE"]. VCI is called first. VCI does not validate that the symbol belongs to a VN equity/index before sending the request; the upstream VCI endpoint returns an empty columnar payload for BTCUSDT, which the adapter converts to a canonical empty frame. The router treats an empty frame as an authoritative "zero rows" answer and does not fall back.

The same happens with KBS (also returns 0 rows); MAS raises HTTP 400, which is non-retryable, so the chain stops before reaching Binance anyway.

Why this matters:

A financial engineer consuming the public Router API cannot fetch crypto history without explicitly pinning source=BINANCE. The advertised cross-asset convenience (mkt.history("BTCUSDT", ...)) is broken.

Evidence:

Live drift test test_router_history_crypto fails exactly here:

AssertionError: Router.history(BTCUSDT): zero rows — endpoint returned no data

Recommended fix:

  1. Adapter-side asset-class guard: VN equity adapters should check self.registry.asset_type(symbol) and raise CapabilityNotSupported (retryable) for AssetType.CRYPTO before hitting the network. They can never serve crypto.
  2. Router-level capability routing: Alternatively, add an asset_type aware dispatch so the router can skip VN brokers for crypto symbols. This is more robust because it works even if a broker returns a misleading empty frame.
  3. Consider treating an empty OHLCV frame from a provider that does not claim the asset class as retryable, or at least logging a warning that fallback was aborted.

3.2 KBS financials endpoint is dead, and fallback does not rescue it ⭐ HIGH

Reproduction:

router.financial_statement("FPT", "income_statement", period="year", lang="en")
# raises ProviderHTTPError: HTTP 404 from KBS

MAS.financial_statement("FPT", ...) works fine (337 rows), but the router never reaches it because KBS is tried first and a 404 is classified as non-retryable.

Root cause:

The transport treats all 4xx as authoritative caller/endpoint errors. That is usually correct (delisted symbol, wrong URL), but a 404 from a known-dead endpoint should trigger fallback.

Recommended fix:

  • Distinguish "endpoint unreachable / dead" from "symbol not found." One pragmatic approach: if a provider consistently 404s across multiple symbols in the live drift suite, mark the provider as unhealthy for that capability and skip it in the router. Or, make ProviderHTTPError.is_retryable context-aware for dead-endpoint detection.
  • Short-term: reorder financials to put MAS first, or remove KBS from financials until the endpoint is restored. The live suite already flags this.

3.3 REST API returns HTTP 500 for an unregistered source ⭐ MEDIUM

Reproduction:

GET /v1/history?symbol=BTCUSDT&start=2026-05-01&end=2026-06-01&source=FMP
# 500 {"error":"FinkitError","detail":"No provider registered for source 'FMP'"}

A bad source parameter is a caller error and should return 4xx (e.g., 400 or 422), not 500.

Recommended fix:

Catch the registry lookup error in the route and raise InvalidInputError("source", ...) so it maps to 422.


4. Medium-Severity Issues

4.1 Router fallback for non-VN symbols is fragile by design

The history capability mixes VN equity, VN index, and crypto providers in one fallback chain. This is convenient but requires every upstream adapter to fail in a retryable way for symbols it cannot serve. The current design trusts adapters to do this; at least VCI and KBS do not.

Recommended fix:

Consider a per-asset-class dispatch:

if registry.asset_type(symbol) == AssetType.CRYPTO:
    order = ["BINANCE"]
elif registry.asset_type(symbol) == AssetType.INDEX and symbol in GLOBAL_INDICES:
    order = ["FMP", "STOOQ"]
else:
    order = DEFAULT_ORDER["history"]

This removes the implicit assumption that VN brokers will refuse crypto symbols cleanly.

4.2 source_tz default in _to_vn_naive is UTC, but normalize_market_valuation defaults to VN_TZ

The default is inconsistent across normalizers. While documented, it is an easy foot-gun for adapter authors. A single default (or explicit required argument) would reduce risk.

4.3 Empty quote/depth for crypto when routed

Same class as 3.1: router.price_depth("BTCUSDT") returns 0 rows because VN brokers are tried first. Pinning source=BINANCE works.

4.4 fmp.py / fred.py live paths are not exercised by default

Because keys are not shipped, these adapters are not registered unless env vars are set. The offline tests cover them with VCR/synthetic data, but there is no CI path that proves the keyed integration still works. Add a nightly or keyed CI job.


5. Low-Severity / Polish Issues

5.1 Static analysis failures

ruff check reports 18 errors, mostly E731 (lambda assignments) and F401 (unused imports). These should be fixed and added to CI.

5.2 Starlette / httpx deprecation warning

fastapi.testclient warns that httpx is deprecated in favor of httpx2. This will become a hard failure in a future FastAPI/Starlette release.

5.3 manual_review.py UTC deprecation

Minor: datetime.utcnow() is deprecated in Python 3.12. Review scripts should use datetime.now(timezone.utc).

5.4 validate_frame uses assert for contract enforcement

Using assert for production validation means it can be disabled with python -O. Financial correctness checks (OHLC integrity, enum domains, primary-key uniqueness) should use explicit if ...: raise AssertionError(...) or dedicated exceptions.

5.5 Coverage is line-only

branch = false in pyproject.toml. For a library that relies heavily on fallback branches and error paths, branch coverage should be enabled and the floor raised gradually.


6. Financial-Engineer Workflow Checklist

Task Via Router Direct adapter Notes
VN stock OHLCV (FPT) Works
VN index OHLCV (VNINDEX) Works
Crypto OHLCV (BTCUSDT) Router bug (Issue 3.1)
Crypto depth Router bug
Global index (SPX) ⚠️ ⚠️ Stooq WAF from datacenter; FMP needs key
VN market movers/valuation Works
VN macro (CPI) Works
VN funds Works
Commodities (gold) Works
Financial statements ✅ (MAS) KBS dead, no fallback
Company overview Works
Listing Works
Price board Works

7. Recommendations for the Development Team

Immediate (block release)

  1. Fix the crypto router bug by adding asset-class guards in VN equity adapters or asset-class-aware dispatch in Router. Re-run test_router_history_crypto live until it passes.
  2. Fix the KBS financials dead endpoint either by removing KBS from the financials order, making dead-endpoint 404s retryable, or restoring the endpoint.
  3. Return 4xx instead of 500 for an invalid/unknown source query parameter in the REST API.

Short-term

  1. Enable branch = true in coverage and add branch-level tests for router fallback paths.
  2. Fix all ruff errors and add a lint job to CI.
  3. Add a keyed-provider CI job (FMP + FRED) that runs at least weekly.
  4. Update datetime.utcnow() usages to timezone-aware equivalents.

Long-term

  1. Consider splitting the monolithic history capability into history (VN equity/index) and crypto_history, or make the router asset-class-aware, to eliminate the whole class of cross-asset fallback bugs.
  2. Add an "endpoint health" monitor that blacklists a provider+capability after repeated hard failures, rather than relying solely on static DEFAULT_ORDER.
  3. Document the exact semantics of empty frames vs NoDataError vs 404 more prominently for API consumers.

8. Conclusion

finkit is a promising, well-tested foundation with clear architectural intent. For VN equity/index workflows it already delivers clean, schema-valid data with good error handling. For cross-asset and crypto workflows routed through the public Router, it currently fails silently and needs urgent fixes before it can be trusted in production.

The good news is that the failures are concentrated in the fallback/routing layer, not in the provider parsers or normalizers. Fixing Issues 3.1–3.3 would raise the library from "VN-only usable" to "genuinely cross-asset" and make the live drift suite green again.


Review generated by running: pytest -q, pytest --cov=src/finkit, pytest tests/live --run-live, live API probes against Binance/VCI/KBS/MAS/VND/MBK/SPL/FMARKET, and ruff check src/finkit tests/finkit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment