V0 Changelog#
0.49.0 (2026-05-26)#
Added#
DataProviderToolkit.drop_discrepant_processed_endpoint_tables_rows for dropping the discrepant rows
Changed#
FMP and LSEG fundamentals: on CommonDataDiscrepancyError, drop the discrepant rows entirely instead of nulling their non-key columns and retrying.
DataProviderToolkit.format_consolidated_discrepancy_table_for_output is now a class method
Added visual separators before and after discrepancy table logs for better visual separation and consistency
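The difference between nulling and dropping discrepant rows can be sketched in plain Python. This is only an illustration under assumptions: rows are modeled as dicts with a "date" primary key, and null_non_key_columns and drop_discrepant_rows are hypothetical helpers, not part of the DataProviderToolkit API.

```python
# Hypothetical sketch: rows are plain dicts keyed by a "date" primary key.
# Neither helper belongs to the library's actual API.

def null_non_key_columns(rows, discrepant_keys, key="date"):
    """Previous behavior: keep discrepant rows but blank every non-key column."""
    return [
        {k: (v if k == key else None) for k, v in row.items()}
        if row[key] in discrepant_keys
        else dict(row)
        for row in rows
    ]


def drop_discrepant_rows(rows, discrepant_keys, key="date"):
    """New behavior: remove discrepant rows entirely."""
    return [dict(row) for row in rows if row[key] not in discrepant_keys]


rows = [
    {"date": "2024-03-31", "revenue": 100},
    {"date": "2024-06-30", "revenue": 120},
]
discrepant = {"2024-06-30"}

nulled = null_non_key_columns(rows, discrepant)
dropped = drop_discrepant_rows(rows, discrepant)
```

Dropping leaves no all-null placeholder rows behind for later steps to deal with.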
Fixed#
Handle FMP “Column ‘FundamentalDataRow.filing_date’ not found in table.” errors
DataProviderToolkit.consolidate_processed_endpoint_tables crashed with pyarrow.lib.ArrowInvalid: Data type null is not supported in join non-key field __indicator_for_validity when an endpoint table had zero rows; the validity indicator array is now explicitly typed as bool.
Removed#
DataProviderToolkit.clear_discrepant_processed_endpoint_tables_rows
DataProviderToolkit._clear_table_rows_by_primary_key
0.48.1 (2026-05-04)#
Fixed#
MarketDataDailyRow non-negative value validation was leaking to fields added to extended entities
Entity validation error messages now mention the correct entity name for extended entities
0.48.0 (2026-05-01)#
Changed#
Moved __all__ declarations to the top of the files
Fixed#
FMP fundamentals: handle all-null columns that would crash with pyarrow.lib.ArrowInvalid: Data type null is not supported in join non-key field
FMP fundamentals: handle duplicate statement periods crashing with DataProviderToolkitRuntimeError: Primary key merge table contains duplicate rows.
FMP fundamentals: handle some but not all endpoints having no rows, crashing with KeyError: 'Field "FundamentalDataRow.filing_date$filingDate" does not exist in schema'
FMP fundamentals: handle duplicate statement primary keys crashing with pyarrow.lib.ArrowInvalid: Filter inputs must all be the same length
FMP fundamentals: handle mismatched statement key columns crashing with TypeError: '<' not supported between instances of 'NoneType' and 'datetime.date'
0.47.0 (2026-04-14)#
Added#
Official support for Python 3.14
Changed#
Breaking: parameters_datacurator.xlsx configuration file renamed to data_curator_parameters.xlsx
Breaking: FundamentalDataRowCashFlow.net_cash_from_investing_activites renamed to FundamentalDataRowCashFlow.net_cash_from_investing_activities to fix typo
custom_calculations.py template c_test calculation now uses split adjusted prices for better provider compatibility
Simplified Excel configuration entry script template
Include Yahoo Finance data provider in the production docker image
0.46.1 (2026-03-17)#
Fixed#
Handle error when FMP data provider returns empty market data
0.46.0 (2026-03-02)#
Added#
LSEG Workspace data provider
More data provider error exceptions
0.45.1 (2026-02-09)#
Fixed#
Handle error pyarrow.lib.ArrowInvalid: Filter inputs must all be the same length when there are duplicate filing dates for different statements in FinancialModelingPrep fundamental data
0.45.0 (2025-12-16)#
Changed#
Load ReadTheDocs dependencies from pyproject.toml instead of requirements.txt
Fixed#
DataProviderToolkit entity field mapping methods fail on subclassed entities
BaseDataBlock entity packing methods fail on subclassed entities
0.44.0 (2025-12-11)#
Added#
BaseDataEntity class as new parent class for all data entities
First partial implementation of data blocks, for generalizing data entity assembly from consolidated data tables
BaseDataBlock class as data block parent class, containing common entity assembling logic
Data block classes for all current data entities
DataProviderToolkit for generalizing the logic for constructing the consolidated data tables passed to the data blocks, including validation and better error handling/debugging
Dependency on the networkx library for topological sorting and related uses
Devcontainer configuration
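As a rough illustration of what topological sorting provides, the sketch below orders a hypothetical dependency graph so that every item comes after the items it depends on. It uses the standard library's graphlib as a stand-in for networkx, to keep the example free of third-party dependencies. The graph contents are assumptions made up for illustration, not the library's internal structures.

```python
# Sketch only: graphlib (standard library) stands in for networkx here,
# and the dependency graph below is invented for illustration.
from graphlib import TopologicalSorter

# Each key depends on the names in its set.
dependencies = {
    "c_market_cap": {"close", "shares_outstanding"},
    "c_log_market_cap": {"c_market_cap"},  # hypothetical derived column
    "close": set(),
    "shares_outstanding": set(),
}

# static_order() yields every node only after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())
```

A cycle in the graph makes static_order() raise graphlib.CycleError, which is one way circular dependencies can be detected.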
Changed#
Moved dividend and split date and factor field declarations out of ColumnBuilder and into the respective entity modules
Standardized the way in which data providers declare endpoints and their respective entity field to tag mappings, including tag preprocessors
Refactored FinancialModelingPrep to use the new generalized entity creation and validation APIs
Deprecated#
services/entity_helper.py module will be removed in a future version, with all functionality being moved into the BaseDataBlock class.
0.43.1 (2025-09-09)#
Removed#
ta and pandas-ta unused dependencies
0.43.0 (2025-08-26)#
Changed#
Improved README file
Fixed#
DataColumn division between int columns should return float column
0.42.0 (2025-08-04)#
Added#
DataProviderPaymentError and DataProviderConnectionError exceptions
FinancialModelingPrep now has class methods for setting and getting whether the user’s account plan is paid
pdm run docs script as shortcut to the corresponding docs maker for the current OS. Use with pdm run docs html, etc.
Changed#
DataProviderInterface._request_data() now has special handling for 402 Payment Required errors
FinancialModelingPrep fundamental data methods now handle “The values for ‘limit’ must be between 0 and 5 based on your current subscription” 402 Payment Required errors
data_curator.main() now handles uncaught DataProviderPaymentError exceptions
0.41.0 (2025-07-02)#
Added#
DataColumn.__hash__ method for hashing DataColumns
InMemoryOutput output handler for saving data to memory
CI: CODEOWNERS file for GitHub code review automation
Docs: Inserted hidden toctree entries in each category index.rst to ensure all functions are registered in the Sphinx documentation build.
Docs: Added Use Cases section, with links to the Data-Curator-Use-Cases repo
Changed#
Docs: Refactored features_extension.py to group calculation functions by category for documentation generation.
Docs: Calculation functions are now organized in per-category folders under api/, improving maintainability.
Docs: Each category generates its own index.rst file with a clean table layout listing all functions.
Docs: The Section Navigation panel now displays categories as expandable subsections instead of a flat list.
Docs: Added :ref:-based linking for function references without affecting Excel configuration behavior.
0.40.2 (2025-06-05)#
Fixed#
CLI init excel command not creating .env file in the Config folder
c_market_cap calculation now uses split-adjusted close price, to reduce the jumps right after a split
ReadTheDocs integration basic setup
0.40.1 (2025-06-04)#
Fixed#
OS Error on CLI init script because the entry script template was missing from the wheel data
0.40.0 (2025-06-04)#
Changed#
First public release, now on PyPI
0.39 (2025-06-03)#
Changed#
ticker has been renamed to main_identifier in Excel configuration and all related code
Ticker entity is now a MainIdentifier entity, with Ticker.symbol becoming MainIdentifier.identifier
MainIdentifier.identifier now only validates that no whitespace is present
TickerNotFoundError is now IdentifierNotFoundError
Configuration entity no longer validates identifiers format, as different data providers will have different required formats
fi_ prefix for income statement columns now becomes fis_
Versioning changed to remove 1.0b prefix, as version 1.0 still requires quite a few features.
FundamentalDataRowIncome entity is now FundamentalDataRowIncomeStatement
DataProviderInterface.init_config() is now DataProviderInterface.initialize()
DividendDataRow.adjusted_dividend renamed to DividendDataRow.split_adjusted_dividend
MarketDataDailyRow entity fields added for all split-adjusted and dividend-and-split-adjusted data sets
FMP updated all endpoints to the new “stable” version
FMP class and all related code usages now renamed to FinancialModelingPrep
Parameters file now uses financial_modeling_prep instead of fmp as the provider name
Renamed most calculation feature functions for consistency
Updated Excel parameters file template to the new public API
Fixed#
CLI update excel command not working on dev
Removed#
MarketDataDailyRow.adjusted_closefield, as it is now redundant
0.38 (2025-05-08)#
Added#
DataColumn comparison operators ==, !=, <, <=, >, >= for use in binary logic, filters, etc.
DataColumn.boolean_and and DataColumn.boolean_or methods for boolean comparisons between multiple BooleanArray DataColumns
Missing DataColumn reverse arithmetic operation tests
DataColumn._replace_array_mask_with_nones private method to replace the mask of a DataColumn with None values
DataColumn related exceptions
Changed#
DataColumn divisions involving decimal inputs now output float64, as any decimal precision was lost in the division anyway, making the decimal output unnecessary
0.37 (2025-04-11)#
Added#
Created conftest.py with pytest fixtures to load example data from CSV files used in calculation tests.
Helper methods in helpers.py to compute technical indicators, including Exponential Moving Average (EMA).
Pytest coverage for calculations.py, reaching 96%.
Changed#
Prefixed all functions in calculations.py with c_.
Alphabetically ordered all functions in calculations.py and their corresponding tests.
Replaced usage of ta library for technical indicators in almost all features, except for c_chaikin_money_flow and c_relative_strength_index_14d, which are still under development.
Removed CSV-based test data fixtures previously defined directly in calculations_test.py.
0.36 (2025-03-24)#
Changed#
Simplify dev installation procedure in README.md and Dockerfile
Fixed#
ExcelConfigurator: Data providers returning validate_api_key() as None should not raise an error
ColumnBuilder: Extended Fundamental entity properties fail when full data period contains no fundamental data
0.35 (2025-03-13)#
Fixed#
DataColumn extended entity subclasses should now work with empty data on the first rows
0.34 (2025-03-04)#
Added#
DataProviderInterface tests (incomplete)
Changed#
DataProviderInterface._find_first_date_before_start_date_in_descending_dates renamed to DataProviderInterface._find_first_date_before_start_date, added descending_order parameter
DataProviderInterface._find_unordered_dates_in_descending_dates renamed to DataProviderInterface._find_unordered_dates, added descending_order parameter
0.33 (2025-02-20)#
Changed#
Feature and custom calculation function names should now always start with a c_ prefix.
Improved some error texts’ legibility
Deprecated#
Feature and custom calculation function names without the c_ prefix will stop working in the public release version.
0.32 (2025-02-17)#
Changed#
Breaking: entity_helper.fill_fields renamed to entity_helper.convert_data_row_into_entity_fields
entity_helper.convert_data_row_into_entity_fields now skips the type conversion logic if the value is already the expected type.
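The "skip conversion when the value already has the expected type" behavior can be sketched as follows. convert_value is a hypothetical helper written for illustration, not the entity_helper implementation, and the ISO date parsing is an assumption about the input format.

```python
# Hypothetical helper; it only illustrates the early-return idea.
import datetime
import decimal


def convert_value(value, expected_type):
    """Return value unchanged if it already has the expected type, else convert it."""
    if isinstance(value, expected_type):
        return value  # already the expected type: skip the conversion logic
    if expected_type is datetime.date:
        # Assumption: dates arrive as ISO strings such as "2025-02-17".
        return datetime.date.fromisoformat(value)
    return expected_type(value)


already_date = datetime.date(2025, 2, 17)
same_object = convert_value(already_date, datetime.date)
parsed_date = convert_value("2025-02-17", datetime.date)
parsed_decimal = convert_value("1.25", decimal.Decimal)
```

The early return avoids re-parsing values that a data provider has already delivered in their final type.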
0.31 (2025-02-13)#
Changed#
Can now subclass all fundamental statement row entities to add custom data columns
Fixed#
Regression: Debugger initialization was not being called
0.30 (2025-01-31)#
Changed#
Python 3.13 now officially supported
Can now subclass MarketDataDailyRow to add custom market data columns
Fixed#
CLI update should load the templates from the templates dir instead of the Python data dir when installed in editable mode
templates dir not found when installed in editable mode on Windows
0.29 (2025-01-25)#
Added#
kaxanuk.data_curator.modules.extension_handler aliased to kaxanuk.data_curator.extension_handler for loading external extension modules
Changed#
YahooFinance is now a separate package, kaxanuk.data_curator_extensions.yahoo_finance under KaxaNuk/Data-Curator-Extensions_Yahoo-Finance
Removed#
YahooFinance from this main package
0.28 (2024-12-19)#
Added#
Sphinx documentation generator with Readthedocs support
Creation of a .py extension for parsing the calculations file and classifying its content using the ..category:: directive found in the docstring
Implementation of a structure to support .md files located outside the docs folder
Development of a Sphinx custom template for the documentation.
CLI --version command
kaxanuk.datacurator.__package_name__
kaxanuk.datacurator.__package_title__
Changed#
__version__ and __parameters_format_version__ moved from __version__.py to __init__.py
Migration of the documentation structure to the docs folder, replacing the deprecated Read_the_Docs folder.
Parquet and CSV output handlers no longer take data_features_subdir parameter, everything is saved to the Output folder
Fixed#
Regression: YahooFinance not correctly loaded by Excel configuration entry script
Editable install CLI init and update now work in any directory
CLI catch OSError when there’s a file permissions issue
Removed#
__version__.py file
kaxanuk.datacurator.version
Data_and_Features subfolder
0.27 (2024-11-17)#
Added#
CLI update format ‘entry_script’ for updating just the entry script
validate_api_key abstract (required) method to DataProviderInterface
Changed#
FinancialDataProviderInterface is now DataProviderInterface once more
ExcelConfigurator.__init__ API changed, data_providers now receive a typed dict with class and api_key params
Data provider API keys now are per provider
Excel entry script now gets api keys from environment (loading from Config/.env if available)
FMP changed _endpoints MappingProxyType for StrEnum
FMP.validate_api_key makes a request to get AAPL company information
Updated parameters_datacurator file version
0.26 (2024-11-13)#
Added#
DataColumn.__neg__
Left a not implemented placeholder for DataColumn.__pos__ for completeness
Changed#
DataColumn.all_equal() renamed to DataColumn.fully_equal()
Updated usage instructions in README.md
Fixed#
DataColumn reflected operators for +, -, *, /, //, %
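For readers unfamiliar with reflected operators, the sketch below shows how Python falls back to methods such as __rsub__ when the left operand is a plain number. Column is a minimal stand-in class written for illustration. It is not the DataColumn implementation.

```python
# Minimal stand-in class; only a few operators are shown.
class Column:
    def __init__(self, values):
        self.values = list(values)

    def __sub__(self, other):
        # column - scalar
        return Column(v - other for v in self.values)

    def __rsub__(self, other):
        # scalar - column: Python calls this when the left operand
        # (for example an int) does not know how to subtract a Column.
        return Column(other - v for v in self.values)

    def __rtruediv__(self, other):
        # scalar / column
        return Column(other / v for v in self.values)


col = Column([1, 2, 4])
left = (col - 1).values
right = (10 - col).values
ratio = (8 / col).values
```

Without __rsub__ and __rtruediv__, the last two expressions would raise TypeError, because int does not know how to operate on a Column.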
0.25 (2024-11-08)#
Added#
Tests to replicate reflected arithmetic operation errors.
Missing typing on data_column methods.
CLI interface through services.cli
click library for the CLI implementation
.env file to templates (unused at the moment)
__main__.py file to templates
Changed#
DataColumn arithmetic operation tests now grouped into classes.
Docker images now install the library and use the CLI interface
templates dir files moved into data_curator subdir so that installed data files are also installed in a subdir of the data dir
Config templates are now in the templates/data_curator/Config subdir
The PyCharm debugger now only depends on the KNDC_DEBUG_PORT env variable for activation
Fixed#
GitHub workflow broke on pull request merges
Removed#
Config dir (gets created by the CLI interface now)
Output dir (gets created by the CLI interface now)
__main__.py (gets created by the CLI interface now)
0.24 (2024-10-12)#
Changed#
FMP: use unadjustedVolume for MarketDataDailyRow.volume
Standardized .gitignore based on the official GitHub one
Fixed#
Pin Dockerfile base image to python:3.12-slim, as Python 3.13 is now a thing and Pyarrow breaks on it
0.23 (2024-09-21)#
Added#
The GitHub Actions pipeline now builds and pushes the container image to GitHub Packages when pushing the dev branch or a main branch tag.
0.22 (2024-09-21)#
Changed#
DataColumn.__getitem__ - simplified the logic
First steps to build the full GitHub Packages publishing pipeline
Fixed#
FinancialDataProviderInterface._request_data - “AttributeError: ‘URLError’ object has no attribute ‘read’” when _load_ssl_context throws “ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host”
FinancialDataProviderInterface._request_data - connection errors were still throwing up error trace after all connection attempts
0.21 (2024-08-04)#
Added#
tests: ColumnBuilder._process_columns_with_available_dependencies now fully tested
Changed#
prepended KNDC_ to all env vars
increased parameters_datacurator.xlsx version to 0.21
ExcelConfigurator now receives the names of the api key env vars, and loads them itself, as otherwise the warnings can’t really know if the keys were loaded from env vars or not
added optional logger_file parameter to data_curator.main, to enable logging to a file
Fundamental data entity error logs now include the date, where available
Fixed#
logger_level parameter in data_curator.main was being ignored if ExcelConfigurator had been used beforehand, as it was not closing its own logger
ExcelConfigurator now changes its own level to the one in the Excel file, as soon as it can
0.20 (2024-07-28)#
Changed#
parameters.xlsx file renamed to parameters_datacurator.xlsx as a workaround for Excel’s mind-numbingly idiotic inability to have 2 open files with the same name as of 2024
MarketDataDailyRow: validate low <= high
MarketDataDailyRow: validate volume and vwap are not negative
FundamentalDataRowIncome: validate weighted_average_shares_outstanding and weighted_average_shares_outstanding_diluted are not negative
ColumnBuilder._process_columns_with_available_dependencies is now class method, for easier unit testing
pytest: added data_column_debugger.dump_data_columns_to_csv() fixture to conftest.py for dumping DataColumns to a csv file for debugging tests
more unit tests
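The row validations listed above (low <= high, non-negative volume and vwap) could look roughly like this. DailyRowSketch and is_valid are hypothetical names used only for illustration. The real MarketDataDailyRow entity has more fields and may validate differently.

```python
# Hypothetical sketch of the validation rules; not the library's entity class.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DailyRowSketch:
    low: float
    high: float
    volume: float
    vwap: Optional[float] = None  # assumed nullable, so None skips the check

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low must be <= high")
        if self.volume < 0:
            raise ValueError("volume must not be negative")
        if self.vwap is not None and self.vwap < 0:
            raise ValueError("vwap must not be negative")


def is_valid(**fields):
    """Return True when the fields pass every validation rule."""
    try:
        DailyRowSketch(**fields)
    except ValueError:
        return False
    return True


ok_row = is_valid(low=1.0, high=2.0, volume=10.0)
inverted_range = is_valid(low=3.0, high=2.0, volume=10.0)
negative_volume = is_valid(low=1.0, high=2.0, volume=-1.0)
```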
0.19 (2024-06-30)#
Added#
FinancialDataProviderInterface._build_url_with_ticker_path_and_query_params method for building standard URLs
calculations.annualized_volatility_5d() test with csv fixture
Changed#
features.helpers.annualized_volatility and features.helpers.indexed_rolling_window_operation now require kwargs as they are meant as user-facing functions
improved column_builder typing and docstrings
FinancialDataProviderInterface._request_data now accepts a url_builder callable argument for building the URLs, _build_url_with_ticker_path_and_query_params by default
made MarketDataDailyRow.adjusted_close nullable
refactored DataColumn.equal() to use both absolute and relative thresholds when comparing with approximate_floats=true
better documentation and linting
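Combining an absolute and a relative threshold is the same idea as the standard library's math.isclose. The sketch below shows why both are useful. approximately_equal and its default tolerances are assumptions for illustration, not the thresholds or logic used by DataColumn.equal().

```python
# Illustrative sketch; tolerance defaults are assumptions.
import math


def approximately_equal(left, right, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two sequences element-wise using both tolerance types.

    math.isclose passes when
    |a - b| <= max(rel_tol * max(|a|, |b|), abs_tol).
    """
    return len(left) == len(right) and all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(left, right)
    )


# The relative threshold handles large magnitudes...
large_values_match = approximately_equal([1e12], [1e12 + 1.0])
# ...while the absolute threshold handles values near zero,
# where a purely relative check would always fail.
near_zero_match = approximately_equal([0.0], [1e-15])
clearly_different = approximately_equal([1.0], [1.001])
```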
Fixed#
Fundamental Data Provider ‘none’ failed regression
DataColumn.equal failed when presented 2 NullArray arguments
0.18 (2024-05-26)#
Added#
logger_format param to ExcelConfigurator.__init__
Ticker value object for entities
services.validator module, with a validator for date pattern strings
date pattern tests for dividend and split rows
package descriptions for each __init__.py
Changed#
Started removing redundant docstring types from typehinted functions
providers package is now data-providers
internals package is now services
entities now use Ticker instead of str for field ticker
Moved all interface modules to their respective implementation packages
test coverage report now skips fully covered files
Fixed#
ExcelConfigurator missing checks
Handle FMP throwing error 404 when symbol data not found
entity_helper.detect_field_type_errors bug that prevented validation of all fields of an entity
entity_helper and validator services now fully tested, full library test coverage @ 58%
0.17 (2024-05-12)#
Added#
pdm.lock to .gitignore (don’t forget to remove on main branch!)
FinancialDataProviderInterface.init_config() method for running code before looping through each ticker
DataProviderMissingKeyError
YahooFinance missing interface implementation methods
__all__ to all __init__.py files, to be able to remove the F401 linter ignore
ConfiguratorInterface
Changed#
Data providers now use object instances instead of classes
Inject api keys into data providers at object initialization, removed them from individual method params
data_curator.main() now receives individual data provider objects instead of dict with possible choices
data_curator.main() now receives output_handlers list, runs all
data_curator.main() now receives an int logger_level
Complete refactor of ExcelConfigurator, is now an implementation of the new ConfiguratorInterface and has methods for outputting each dependency type
PassedArgumentErrorException
Fixed#
SplitData error with Fundamental data provider set as None
data_curator.py linter G004 error
Removed#
Configuration entity fields related to providers, handlers and loggers
0.16 (2024-05-02)#
Added#
yahoo_finance market data - yahoo!
Changed#
increased parameters file version
0.15 (2024-04-28)#
Added#
split data
initial entity_helper tests
Changed#
MarketDataDailyRow.vwap now nullable
use entity_helper.detect_field_type_errors() for all downloaded data entity validation
entity_helper.fill_fields() now accepts null field correspondences
updated parameters.xlsx template to add new fields, column width adjustments
0.14 (2024-04-23)#
Added#
modules/data_column 100% effective test coverage!
dividend data - had to rework some internal machinery
entity_helper.detect_field_type_errors() for validating entity field types
Changed#
pdm run test now includes coverage term-missing report
attribute_filler.py renamed to entity_helper.py
0.13 (2024-04-06)#
Added#
DataColumn.__mod__() actual implementation
DataColumn private members tests, class coverage at 90%
Changed#
Replaced generic exceptions by custom ones
Fixed#
Decimal precision out of range errors on DataColumn.__add__(), .__subtract__()
Decimal precision out of range error on DataColumn._mask_zeroes()
Multiple DataColumn private methods minor bugs
0.12 (2024-03-24)#
Changed#
CHANGELOG.md updated Keep a Changelog link to version 1.1.0
Fundamental data: missing cash flow or balance sheet now returns just the income statement data if available
Improved Api server error retries handler
Api server errors (after retries) now fully stop execution
Publicly exposed entity classes from kaxanuk.data_curator.entities namespace
Improved error handler for circular dependencies
Improved error handler for missing custom calculation functions
Improved error handler for Excel parameters file configuration errors
Removed redundant src/kaxanuk/py.typed file
Fixed#
“Decimal precision out of range” error when dividing decimal columns with too many decimals
0.11 (2024-03-17)#
Added#
pdm run install_dev
parquet output handler
py.typed files to declare the whole library is typehinted
Changed#
now loading api_keys from env vars, only using templates/parameters.xlsx as fallback
increased templates/parameters.xlsx version
fixed templates/parameters.xlsx start date validation
added all default output columns to templates/parameters.xlsx
improved internal documentation in templates/parameters.xlsx
Docker: don’t uninstall pdm on dev environment
renamed pdm run test_with_coverage to pdm run test
Fixed#
support index tickers with ^ character
“Decimal precision out of range” error when multiplying decimal columns with too many decimals
ExcelConfigurator typecasting None values to ‘None’ string
mypy configuration and some typehint errors
0.10 (2024-03-10)#
Added#
coverage tests under pytest-cov
Changed#
modules/security_calculations.py is now features/calculations.py
moved security_calculations helper functions into features/helpers.py
moved modules/attribute_filler.py to internals/attribute_filler.py
moved modules/column_builder.py to internals/column_builder.py
exposed FinancialDataProviderInterface as kaxanuk.data_curator.interfaces.FinancialDataProviderInterface
exposed OutputHandlerInterface as kaxanuk.data_curator.interfaces.OutputHandlerInterface
refactored DataColumn tests
Fixed#
Circular references error when custom calculation function parameter columns are missing from the output column list
__main__.py now only injects src to sys.path if not loading as installed library
0.9 (2024-03-03)#
Added#
An actual public API for the library by means of __all__ and imports in __init__.py files
DataColumn methods, fully unit tested: all_equal, concatenate, equal
DataColumn property: type
Basic url request retry functionality
Changed#
CsvOutputter is now CsvOutput
__main__.py now uses the public library API in its imports
Fixed#
security_calculations._indexed_rolling_window_operation was broken, so last_twelve_month… functions returned wrong data
Improved internal documentation
Improved/simplified some type hints
Regression: Empty market data no longer terminates the whole process
Removed#
Numpy no longer a dependency
0.8 (2024-02-26)#
Added#
pytest pyarrow_helper fixture, for helper functions for testing PyArrow arrays
Changed#
No more Numpy! DataColumn, and thus security_calculations, now work on top of PyArrow!
DataColumn public API changed, but operator overloading works the same
security_calculations refactor as PyArrow simplifies many operations, allows easier use of pandas
security_calculations output columns are automatically wrapped in DataColumn
0.7 (2024-02-11)#
Added#
Add main() optional parameter logger_format, for configuring the logger
Changed#
Make entity attribute typing/casting errors more explicit
Remove revenue>=0 checks
Simplify entity attribute type validations
Change the default logger format for more readable logs
Allow floats as entity attribute type
Moved all general helper methods from FMP to FinancialDataProviderInterface, so any provider can use them
Fixed#
Require Configurator.start_date and end_date to be explicitly datetime.date
0.6 (2024-02-05)#
Fixed#
Add blank FundamentalData rows for omitted data (in case of amendments, missing fundamentals, etc.)
0.5 (2024-02-04)#
Added#
Custom exceptions
New library dependency for semver version comparisons: packaging
Changed#
New parameters file template
Now versioning the parameters file formats, and checking them in ExcelConfigurator
Fundamental data provider can be now set independently of market data one, or even as disabled
Separate input and config handlers into their own folders
Rename the “quarter” period to “quarterly” in config
Fixed#
Missing fundamental data for a ticker will only omit that data, but keep the market data
0.4 (2024-02-03)#
Changed#
Replace numpy.array with custom DataColumn to remove “where” kwarg boilerplate code.
Inject custom calculations from entry script
Move templates outside src
0.3 (2024-01-07)#
Restructure src to implement under organization/project_name namespace.