Growing data volumes and rapid changes in day-to-day business require analyses to keep pace with the data. The Python programming language, with its flexible and extensive data science libraries, and the high-performance in-memory database management system SAP HANA are an unbeatable combination for fast data analysis. However, the traditional database connection holds this potential back. Data can be accessed in real time within HANA, but the HANA Database Client (hdbcli) driver transfers it only gradually. In addition, uploading data is complicated by the relational database schema, which, unlike in NoSQL databases, must be defined in advance, and connection credentials are often stored in plain text in the code.
Enhancing the database connection between SAP HANA and Python with state-of-the-art parallelization and user-friendly, standardized steps for data download and upload makes the development process faster and simpler.
The NLY SAP HANA Python Connector is a Python module that establishes a high-performance, secure connection to the SAP HANA database via the Open Database Connectivity (ODBC) interface. PyArrow, Turbodbc and the secure user store (hdbuserstore) provide an up-to-date data connection, especially for data engineering and data science:
Through parallelization, data downloads with the HANA connector reach up to 9 times the speed of the standard SAP driver. Uploads can even reach up to 10 times the speed.
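The idea behind a parallelized download can be sketched in a few lines. This is a minimal illustration and not the connector's actual implementation: the function names `chunk_ranges`, `parallel_fetch` and `fake_fetch` are hypothetical, and an in-memory list stands in for a HANA table so that the example runs without a database.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(total_rows, chunk_size):
    """Split the row range [0, total_rows) into (offset, limit) pairs."""
    return [
        (offset, min(chunk_size, total_rows - offset))
        for offset in range(0, total_rows, chunk_size)
    ]


def parallel_fetch(fetch_chunk, total_rows, chunk_size, workers=4):
    """Fetch all chunks concurrently and return the rows in original order.

    fetch_chunk(offset, limit) is a hypothetical callable. Against a real
    database it would run an ordered query with LIMIT/OFFSET on its own
    connection.
    """
    ranges = chunk_ranges(total_rows, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the chunks stay in sequence
        chunks = pool.map(lambda r: fetch_chunk(*r), ranges)
        return [row for chunk in chunks for row in chunk]


# Stand-in for a database table (assumption, for illustration only)
FAKE_TABLE = [(i, f"row-{i}") for i in range(10)]


def fake_fetch(offset, limit):
    return FAKE_TABLE[offset:offset + limit]


rows = parallel_fetch(fake_fetch, total_rows=len(FAKE_TABLE), chunk_size=3)
```

Because each chunk is independent, the work spreads across several connections while the result keeps its row order.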
The connector ideally accesses the HANA database by authenticating with a key from the hdbuserstore, where the HANA connection data is stored securely on the client side. Logging in with a username and password remains possible as an alternative, but is not necessary.
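To illustrate how credentials can stay out of the source code, here is a minimal sketch that builds an ODBC connection string. It is not the connector's API: `build_connection_string`, the key name and the environment variable names `HANA_USER` and `HANA_PASSWORD` are hypothetical, and the sketch assumes that the SAP HANA ODBC driver resolves a `SERVERNODE=@KEY` entry against the hdbuserstore.

```python
import os


def build_connection_string(userstore_key=None, host=None, port=None):
    """Build an ODBC connection string without hard-coded credentials."""
    if userstore_key:
        # Assumption: the driver looks up "@KEY" in the hdbuserstore,
        # so neither host nor password appears in the code.
        return f"DRIVER={{HDBODBC}};SERVERNODE=@{userstore_key}"
    # Fallback: user name and password come from environment variables
    user = os.environ["HANA_USER"]
    password = os.environ["HANA_PASSWORD"]
    return (
        f"DRIVER={{HDBODBC}};SERVERNODE={host}:{port};"
        f"UID={user};PWD={password}"
    )


conn_str = build_connection_string(userstore_key="MYKEY")
```

A key such as `MYKEY` would be created once on the client with the `hdbuserstore SET` command, so that only the key name appears in scripts and notebooks.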
Pandas DataFrames are often used for data manipulation in the context of artificial intelligence. With the HANA connector, data is loaded directly into a DataFrame. During an upload, the DataFrame's data types are mapped to the corresponding HANA data types, and tables that do not yet exist are created automatically.
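How such a type mapping and automatic table creation might work can be sketched as follows. The mapping table and the function `create_table_sql` are assumptions made for illustration and do not reflect the connector's internal rules. The function takes dtype names as strings so that the example runs without pandas installed.

```python
# Assumed mapping from pandas dtype names to HANA column types
DTYPE_TO_HANA = {
    "int64": "BIGINT",
    "int32": "INTEGER",
    "float64": "DOUBLE",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
    "object": "NVARCHAR(5000)",
}


def create_table_sql(schema, table, dtypes):
    """Build a CREATE COLUMN TABLE statement from {column: dtype name}."""
    columns = ", ".join(
        # Unknown dtypes fall back to a generic string column
        f'"{name}" {DTYPE_TO_HANA.get(dtype, "NVARCHAR(5000)")}'
        for name, dtype in dtypes.items()
    )
    return f'CREATE COLUMN TABLE "{schema}"."{table}" ({columns})'


sql = create_table_sql(
    "SALES", "ORDERS",
    {"ORDER_ID": "int64", "AMOUNT": "float64", "CUSTOMER": "object"},
)
```

For a real DataFrame `df`, the dtype dictionary could be obtained with `df.dtypes.astype(str).to_dict()`.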
The connector's functions are closely aligned with the everyday work of developers. Frequently used SQL statements, such as listing all tables of a schema, listing all columns of a table and checking whether a table exists, are integrated into the cursor as easily accessible functions.
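The kind of SQL behind such convenience functions can be sketched with queries against the HANA system views `SYS.TABLES` and `SYS.TABLE_COLUMNS`. The helper names below are hypothetical and are not the connector's actual method names. Each helper returns a parameterized statement together with its parameters, ready to be passed to a cursor's `execute` method.

```python
def list_tables_query(schema):
    """Statement listing all tables of a schema."""
    sql = (
        "SELECT TABLE_NAME FROM SYS.TABLES "
        "WHERE SCHEMA_NAME = ? ORDER BY TABLE_NAME"
    )
    return sql, (schema,)


def list_columns_query(schema, table):
    """Statement listing all columns of a table with their data types."""
    sql = (
        "SELECT COLUMN_NAME, DATA_TYPE_NAME FROM SYS.TABLE_COLUMNS "
        "WHERE SCHEMA_NAME = ? AND TABLE_NAME = ? ORDER BY POSITION"
    )
    return sql, (schema, table)


def table_exists_query(schema, table):
    """Statement returning 1 if the table exists, otherwise 0."""
    sql = (
        "SELECT COUNT(*) FROM SYS.TABLES "
        "WHERE SCHEMA_NAME = ? AND TABLE_NAME = ?"
    )
    return sql, (schema, table)


query, params = table_exists_query("SALES", "ORDERS")
```

Using `?` placeholders instead of string formatting keeps schema and table names out of the SQL text and avoids injection issues.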
In addition to automatic data type conversion, parallelization makes uploads up to 10 times faster.
In the field of artificial intelligence, the amount of data required for analysis is often immense. A fast data download helps not only in the data exploration phase, but also in model training and later prediction. Especially in the prototyping phase, the time gained can be used productively.
With the HANA connector, flexible data manipulation is quickly implemented in Python and runs with a speed advantage. Production systems additionally benefit from the failover support provided by the hdbuserstore. Efficient storage formats greatly reduce the amount of memory required for data retrieval.
The NextLytics Python SAP HANA connector is installed as a Python module. The documentation also covers setting up the hdbuserstore, and a CLI tool simplifies setup and connection management.
NextLytics is always at your side as an experienced project partner. We help you to effectively solve your data problems from data integration to the use of machine learning models. Use the form below to ask your question and we will get back to you as soon as possible.
We look forward to hearing from you!