Identifying Patterns in Text using HANA Database Artifacts: A Comprehensive Approach

  • Created 01/03/2024
  • Modified 01/03/2024
Hello everyone! I am an SAP HANA Senior Consultant working closely with Enterprise HANA and Business Objects Reporting Solutions.
Today I would like to share an approach you can take to identify patterns in text using HANA Database Artifacts.

Requirement: Identify the quarter mentioned in each Document Line Text and display KPIs against this derived quarter. The Document Line Text is maintained by business users, so it is difficult to predict the format in which a user will write the quarter.

Example:

[Figure: examples of quarter formats in the Document Line Text]

Challenge:
Since the quarter format is not maintained uniformly in the text, how do you identify the correct quarter on each line?

Solution:
We will address this challenge by using Regular Expression functions in a HANA Table Function.
To understand the code you will need a basic understanding of Regular Expression syntax. You can refer to the 'Useful References' section at the end of this blog for a quick introduction.

High Level Algorithm:
1. Look at the text maintained by the users and identify all possible patterns (manual step).
2. Create pattern matching templates via REGEX functions.
3. Match each pattern against the text and identify which pattern was used via REGEX functions.
4. Extract the matched pattern and transform it to one common format (e.g. 21Q1).
5. Display KPIs against this DERIVED_QTR (by joining the Table Function to a Calculation View).
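
To make steps 2 to 4 concrete, here is a minimal sketch of the same logic in Python rather than HANA SQLScript. It is only an illustration, not the table function from this blog: the three quarter formats, the names PATTERNS and derive_quarter, and the sample texts are all assumptions chosen for the example. In HANA, functions such as LIKE_REGEXPR and SUBSTR_REGEXPR play the role that re.search and the capture groups play here.

```python
import re

# Step 2: hypothetical pattern templates. These formats are assumptions for
# illustration; the real list comes from reviewing the users' text (step 1).
# Every pattern captures the quarter digit as "q" and the year as "y".
PATTERNS = [
    # Q1 2021, Q1-21, Q1'21
    re.compile(r"\bQ(?P<q>[1-4])[\s\-'/]*(?P<y>\d{4}|\d{2})\b", re.IGNORECASE),
    # 2021Q1, 21-Q1
    re.compile(r"\b(?P<y>\d{4}|\d{2})[\s\-/]*Q(?P<q>[1-4])\b", re.IGNORECASE),
    # 1st Quarter 2021, 2nd quarter 22
    re.compile(
        r"\b(?P<q>[1-4])(?:st|nd|rd|th)\s+quarter\s+(?P<y>\d{4}|\d{2})\b",
        re.IGNORECASE,
    ),
]


def derive_quarter(text):
    """Return the quarter in the common format YYQn (e.g. 21Q1), or None."""
    for pattern in PATTERNS:
        match = pattern.search(text)          # step 3: which pattern is used?
        if match:
            year = match.group("y")[-2:]      # step 4: keep the last two year digits
            return f"{year}Q{match.group('q')}"
    return None                               # no known pattern found in the text


if __name__ == "__main__":
    for line in ["Invoice for Q1 2021 services", "Accrual 2021Q2", "no quarter here"]:
        print(line, "->", derive_quarter(line))
```

The patterns are tried in order and the first match wins, so more specific templates should be listed before more general ones.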

Please Note:
The code below contains more REGEX functions than this scenario strictly requires. I have included them so that you can see how each one is used, which will hopefully help in your own scenario.

I have provided comments which will help you understand what each line of code is doing.

Table Function Code:
...



Output of Table Function:

[Figure: output of the Table Function]


You can then join this Table Function with a Graphical Calculation View and display KPIs against the Derived Quarter.
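
As a rough illustration of what that join achieves, the Python sketch below sums a KPI per derived quarter. It is not a Calculation View definition: the join key (a document line number), the row layouts, and the name kpi_by_quarter are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical table function output: (document line, derived quarter).
derived_rows = [(10, "21Q1"), (20, "21Q1"), (30, "21Q2")]

# Hypothetical KPI rows from the Calculation View: (document line, amount).
kpi_rows = [(10, 100.0), (20, 50.0), (30, 75.0), (40, 20.0)]


def kpi_by_quarter(derived, kpis):
    """Sum the KPI amount per derived quarter, joining on the document line."""
    quarter_of = dict(derived)
    totals = defaultdict(float)
    for line, amount in kpis:
        # Left-join behaviour: lines without a derived quarter are grouped under None.
        totals[quarter_of.get(line)] += amount
    return dict(totals)


if __name__ == "__main__":
    print(kpi_by_quarter(derived_rows, kpi_rows))
```

Grouping unmatched lines under None keeps their KPI values visible, which makes it easier to spot texts that none of the patterns recognised.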

Conclusion:
Pattern matching can easily be done using Regular Expressions in HANA. Keep in mind, though, that the performance of the Calculation View tends to degrade as the number of patterns to be matched and the volume of data to be matched against them grow.

I hope this blog helps you when you face a similar situation.
Feel free to ask any questions. Thank you and have a great day!



Useful References:
Regular Expression Syntax Tutorial: https://regexone.com/
Regular Expression Information:
https://en.wikipedia.org/wiki/Regular_expression
https://www.regular-expressions.info/
SAP Help Documentation:
https://help.sap.com/viewer/7c78579ce9b14a669c1f3295b0d8ca16/LATEST/en-US/a2f80e8ac8904c13959c69bfc3...
Pedro Pascal
Joined 07/03/2018