Question
Sopra Steria Group
FR
Last activity: 6 Aug 2019 2:28 EDT
Cost performance of unoptimized property vs optimized with DataType modification
Hello dear Community,
I'm using Pega 7.3/7.4 and I have a question considering property stored in DataBase.
the context : I need to get a property through a report. This property references the base 64 code to display an pdf and is natively unoptimized. The thing is that this property can contain a very long string (like a length of 270,000 characters). But if I optimize this property, the DataType in DataBase will be in VARCHAR(32).
So here are my few questions :
- Is it possible to optimize a property and then change its Datatype to something able to contain all the information ?
- If it is possible, does changing the datatype deoptimize my property ?
- Finally, what what will be more expensive in term of perfomance : an unoptimized property in the BLOB or an optimized property but with a modifier DataType ?
Best regards,
Kevin.
***Edited by Moderator Marissa to update platform capability tags****
-
Like (0)
-
Share this page Facebook Twitter LinkedIn Email Copying... Copied!
Accepted Solution
Ernst & Young LLP
MT
Couple of points:
1. Never expose .pyAttachStream attribute: .pyAttachStream property is supposed to hold the attachment content (in base64 encoding). Once you attach a file to a case, PRPC converts attachment content into string (Base64 enconded) and store that in pyAttachStream property. As attached file size can easily be in MBs, so you can imagine what the max length of this property can be (certainly not 256 or 1024 char :-) ).
The reason in OOTB pc_data_workattach table this property is part of BLOB, as BLOB column can support up to 2 GB content.
Couple of points:
1. Never expose .pyAttachStream attribute: .pyAttachStream property is supposed to hold the attachment content (in base64 encoding). Once you attach a file to a case, PRPC converts attachment content into string (Base64 enconded) and store that in pyAttachStream property. As attached file size can easily be in MBs, so you can imagine what the max length of this property can be (certainly not 256 or 1024 char :-) ).
The reason in OOTB pc_data_workattach table this property is part of BLOB, as BLOB column can support up to 2 GB content.
Column Name | Datatype | Size |
---|---|---|
pxcommitdatetime | TIMESTAMP | 29 |
pxsavedatetime | TIMESTAMP | 29 |
pxcreatedatetime | TIMESTAMP | 29 |
pxcreateopname | VARCHAR | 128 |
pxcreateoperator | VARCHAR | 128 |
pxcreatesystemid | VARCHAR | 32 |
pxinsname | VARCHAR | 128 |
pxobjclass | VARCHAR | 96 |
pxupdatedatetime | TIMESTAMP | 29 |
pxupdateopname | VARCHAR | 128 |
pxupdateoperator | VARCHAR | 128 |
pxupdatesystemid | VARCHAR | 32 |
pylabel | VARCHAR | 64 |
pxattachkey | TIMESTAMP | 29 |
pxrefobjectkey | VARCHAR | 255 |
pxattachname | VARCHAR | 255 |
pzinskey | VARCHAR | 255 |
pzpvstream | BINARY | 2147483647 |
Given that you have exposed this property as a separate column, your report def is (in all possibility) returning only a part of attachment content as no way you can expose that much information (varchar 2 in oracle support upto 4000 bytes). You can verify that by looking at the content/length of that property on clipboard page of report result and on corresponding records instance (open that instance of Data-WorkAttach-File instance separately - using Obj-Open).
In other words, your design to expose that property as a column to get the attachment content will not work as PRPC is not retrieving the full content.
2. You can expose that property as CLOB column, as another member suggested, and also do this change in DSS before you can include that attribute in select clause of your report def. While that might work (Honest, I never tried this kind of setup), but your current design would be too much resource-intensive which will impact both your DB server and network traffic between PRPC server and DB server.
Set the below DSS to true.
Owning ruleset: Pega-RULESPurpose: reporting/retrieveFullClobContent
3. If possible, do NOT expose that property, rather switch to a design that would not require that attribute to be exposed.
Ernst & Young LLP
MT
To avoid significant impact on performance, I would suggest not to retrieve that much information through report. As you haven't explained your use case and you want to use report def to retrieve data, so I assume you would be showing all such records on a tabular format (grid, repeating section). I would suggest to use Report def to retrieve only the pzInsKey of the instance that holds that property, and render it as URL on UI. On click event on that URL, open the instance using Obj-Open and retrieve that property and render its content as per your requirement.
This on-demand retrieval of that attribute (actually the instance that holds it) would reduce network traffic and performance of report def.
Sopra Steria Group
FR
Hello, thank you for your detailled response.
You assumed right, sorry for not being accurate enough. Your idea of opening through an URL is a great option. Unfortunately, this part of the application my team is building is devlopped by a third party, thus we can't do anything about it but we will talk to them about that. It will save us a lot of time and performance.
In fine, do you have any ideas about my original questions like :
- is an optimized property is still considered as optimized if I modified its Datatype ?
- if it does, let's assume (hypothetically) that my data contain information like 300 characters, should I optimized then change its DataType or letting it in the BLOB ?
Best regards,
Kévin.
PEG
GB
I assume you don't (or rather can't) search based on the PDF, and just want to retrieve it, right?
Also what's the size of the BLOB compared with the size of the PDF? If this BLOB is mostly the PDF I'm wandering what the benefit would be optimizing this property (also there would be data storage overhead for doing this).
I don't recall which version we support CLOB data types. But depending on the answers to the above it's possible this may not be needed.
Sopra Steria Group
FR
Hello,
Yes, the PDF is attached to a Case and I want to retrieve the pdf (then display it based on an Action set but we will see this later.)
To retrieve the PDF,in the report we made a join between the table of the Case and the one with the attachments. And it's in the report that we have a warning saying the property is not optimized : The warning is the reason why we wanted to update it first.
To be honest I don't know where I can find the size of the BLOP, I'll gather some informations.
Thank you.
Ernst & Young LLP
MT
Can you post the name of the property in question (the same property for that PRPC raised that warning message)?
Sopra Steria Group
FR
Sure, the property is .pyAttachStream, from class "Data-WorkAttach-File". But the report containing the join is in Class "Link-attachment".
Accepted Solution
Ernst & Young LLP
MT
Couple of points:
1. Never expose .pyAttachStream attribute: .pyAttachStream property is supposed to hold the attachment content (in base64 encoding). Once you attach a file to a case, PRPC converts attachment content into string (Base64 enconded) and store that in pyAttachStream property. As attached file size can easily be in MBs, so you can imagine what the max length of this property can be (certainly not 256 or 1024 char :-) ).
The reason in OOTB pc_data_workattach table this property is part of BLOB, as BLOB column can support up to 2 GB content.
Couple of points:
1. Never expose .pyAttachStream attribute: .pyAttachStream property is supposed to hold the attachment content (in base64 encoding). Once you attach a file to a case, PRPC converts attachment content into string (Base64 enconded) and store that in pyAttachStream property. As attached file size can easily be in MBs, so you can imagine what the max length of this property can be (certainly not 256 or 1024 char :-) ).
The reason in OOTB pc_data_workattach table this property is part of BLOB, as BLOB column can support up to 2 GB content.
Column Name | Datatype | Size |
---|---|---|
pxcommitdatetime | TIMESTAMP | 29 |
pxsavedatetime | TIMESTAMP | 29 |
pxcreatedatetime | TIMESTAMP | 29 |
pxcreateopname | VARCHAR | 128 |
pxcreateoperator | VARCHAR | 128 |
pxcreatesystemid | VARCHAR | 32 |
pxinsname | VARCHAR | 128 |
pxobjclass | VARCHAR | 96 |
pxupdatedatetime | TIMESTAMP | 29 |
pxupdateopname | VARCHAR | 128 |
pxupdateoperator | VARCHAR | 128 |
pxupdatesystemid | VARCHAR | 32 |
pylabel | VARCHAR | 64 |
pxattachkey | TIMESTAMP | 29 |
pxrefobjectkey | VARCHAR | 255 |
pxattachname | VARCHAR | 255 |
pzinskey | VARCHAR | 255 |
pzpvstream | BINARY | 2147483647 |
Given that you have exposed this property as a separate column, your report def is (in all possibility) returning only a part of attachment content as no way you can expose that much information (varchar 2 in oracle support upto 4000 bytes). You can verify that by looking at the content/length of that property on clipboard page of report result and on corresponding records instance (open that instance of Data-WorkAttach-File instance separately - using Obj-Open).
In other words, your design to expose that property as a column to get the attachment content will not work as PRPC is not retrieving the full content.
2. You can expose that property as CLOB column, as another member suggested, and also do this change in DSS before you can include that attribute in select clause of your report def. While that might work (Honest, I never tried this kind of setup), but your current design would be too much resource-intensive which will impact both your DB server and network traffic between PRPC server and DB server.
Set the below DSS to true.
Owning ruleset: Pega-RULESPurpose: reporting/retrieveFullClobContent
3. If possible, do NOT expose that property, rather switch to a design that would not require that attribute to be exposed.
Sopra Steria Group
FR
Thank you for you really complete answer. Now I understand why a property should be optimized or not.