Hello Everyone! I am an SAP HANA Senior Consultant working closely with Enterprise HANA and Business Objects Reporting Solutions.
Today I would like to share an approach which one might take to identify patterns in each text using HANA Database Artifacts.
Requirement:
Identify the Quarter mentioned in each Document Line Text and display KPIs against this derived quarter. This Document Line Text is maintained by the Business Users and it is difficult to determine in which format a Business User will maintain the Quarter in the Text.
E.g.:
Examples of quarter formats in the document line text
Challenge:
Since the Quarter Format is not uniformly maintained in the text, how will you identify the correct Quarter on each line?
Solution:
We will address this challenge by leveraging Regular Expression Functions in a HANA Table Function.
To understand the code, you will need a basic understanding of the syntax of Regular Expressions. You can refer to the ‘Useful References’ section at the end of my blog to gain a quick understanding.
High Level Algorithm:
1. Look at the text maintained by the users and identify all possible patterns (manual step).
2. Create pattern matching templates via REGEX functions.
3. Match each pattern against the text and identify the pattern used via REGEX functions.
4. Extract the matched pattern and transform it to one common format (e.g. 21Q1).
5. Display KPIs against this DERIVED_QTR (by joining the Table Function to a Calculation View).
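Steps 2 to 4 can be sketched outside HANA to make the logic concrete. The snippet below is only an illustration in Python's re module, not the table function itself. The three text formats it recognises ("Q1 2021", "2021 Q1", "1st Quarter 2021") and the names PATTERNS and derive_quarter are assumptions made up for this example; your own list of patterns comes from step 1.

```python
import re

# Step 2: pattern matching templates. These formats are hypothetical examples;
# the real list comes from reviewing the text maintained by the users.
PATTERNS = [
    # Quarter first: "Q1 2021", "Q1-21", "q1'21"
    ("Q_THEN_YEAR",
     re.compile(r"\bQ(?P<q>[1-4])[\s\-/']*(?:20)?(?P<yy>\d{2})\b", re.IGNORECASE)),
    # Year first: "2021 Q1", "21Q1", "2021-Q1"
    ("YEAR_THEN_Q",
     re.compile(r"\b(?:20)?(?P<yy>\d{2})[\s\-/]*Q(?P<q>[1-4])\b", re.IGNORECASE)),
    # Ordinal wording: "1st Quarter 2021", "3rd Qtr 21"
    ("ORDINAL",
     re.compile(r"\b(?P<q>[1-4])(?:st|nd|rd|th)\s+(?:quarter|qtr)\.?\s+(?:20)?(?P<yy>\d{2})\b",
                re.IGNORECASE)),
]


def derive_quarter(text):
    """Return (pattern_name, derived_qtr) for the first matching pattern,
    or (None, None) when no pattern matches."""
    for name, pattern in PATTERNS:
        match = pattern.search(text)           # Step 3: match and identify the pattern used
        if match:
            # Step 4: extract the parts and transform to the common format YYQn
            return name, f"{match.group('yy')}Q{match.group('q')}"
    return None, None


if __name__ == "__main__":
    for line in ["Invoice for Q1 2021 services", "2021 Q4 rebate", "3rd Quarter 2021 accrual"]:
        print(line, "->", derive_quarter(line))
```

The patterns are tried in order and the first match wins, so more specific patterns should be listed before more general ones.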
Please Note:
The code below contains more REGEX functions than you might actually require for this scenario. I have added them only so that you can see how to use them, which will hopefully help you in your particular scenario.
I have provided comments which will help you understand what each line of code is doing.
Table Function Code:
...
Output of Table Function:
Output of the table function
You can then join this Table Function with a Graphical Calculation View and display KPIs against the Derived Quarter.
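To illustrate what this join achieves, here is a minimal Python sketch that sums a KPI per derived quarter. The row layouts, document IDs, amounts, and the function name kpi_by_quarter are all hypothetical, and the left-outer-join behaviour (documents without a derived quarter are kept under None) is an assumption; in HANA the join itself is modelled in the Calculation View.

```python
from collections import defaultdict

# Hypothetical output rows of the table function: (document_id, derived_qtr)
derived_rows = [("D1", "21Q1"), ("D2", "21Q1"), ("D3", "21Q2")]

# Hypothetical KPI rows from the calculation view: (document_id, amount)
kpi_rows = [("D1", 100.0), ("D2", 50.0), ("D3", 75.0), ("D4", 10.0)]


def kpi_by_quarter(derived, kpis):
    """Sum KPI amounts per derived quarter.
    Documents with no derived quarter are grouped under None (assumed left outer join)."""
    quarter_of = dict(derived)                  # document_id -> derived_qtr
    totals = defaultdict(float)
    for doc_id, amount in kpis:
        totals[quarter_of.get(doc_id)] += amount
    return dict(totals)


if __name__ == "__main__":
    print(kpi_by_quarter(derived_rows, kpi_rows))
```

Keeping the unmatched documents visible under None makes it easy to spot texts whose quarter format is not yet covered by any pattern.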
Conclusion:
Pattern Matching can easily be done by using Regular Expressions in HANA. Keep in mind, though, that the performance of the Calculation View will degrade in direct proportion to the number of patterns to be matched and the volume of data against which the patterns need to be matched.
I hope this blog helps you when you face a similar situation.
Feel free to ask any questions. Thank You and have a great day!
Useful References:
Regular Expression Syntax Tutorial:
https://regexone.com/
Regular Expression Information:
https://en.wikipedia.org/wiki/Regular_expression
https://www.regular-expressions.info/
SAP Help Documentation:
https://help.sap.com/viewer/7c78579ce9b14a669c1f3295b0d8ca16/LATEST/en-US/a2f80e8ac8904c13959c69bfc3...