Finding the CORRECT MIME type of a file attached in Pega
What is a MIME (Multipurpose Internet Mail Extensions) type:
A MIME type is a two-part identifier for file formats and content transmitted on the Internet. It indicates the nature and format of a document or file. Because it describes the content itself, the true MIME type does not change when the file's extension is changed.
A MIME type usually consists of two parts: a type and a subtype, separated by a slash. The type represents the general category the data falls into, such as video or text. The subtype identifies the exact kind of data within that category.
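The type/subtype split described above can be sketched in a few lines of Java. `MimeParts` is a hypothetical helper written for this illustration only; it is not part of Pega or Tika, and it simply ignores optional parameters such as `; charset=UTF-8`.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not a Pega or Tika class.
final class MimeParts {
    final String type;     // general category, e.g. "text"
    final String subtype;  // exact kind of data, e.g. "html"

    private MimeParts(String type, String subtype) {
        this.type = type;
        this.subtype = subtype;
    }

    static MimeParts parse(String mime) {
        // Drop optional parameters such as "; charset=UTF-8".
        int semi = mime.indexOf(';');
        String essence = (semi >= 0 ? mime.substring(0, semi) : mime).trim();

        // The slash must separate two non-empty parts.
        int slash = essence.indexOf('/');
        if (slash <= 0 || slash == essence.length() - 1) {
            throw new IllegalArgumentException("Not a type/subtype pair: " + mime);
        }
        return new MimeParts(
                essence.substring(0, slash).toLowerCase(Locale.ROOT),
                essence.substring(slash + 1).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        MimeParts p = parse("application/json");
        System.out.println(p.type + " | " + p.subtype); // prints "application | json"
    }
}
```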
Here are some examples of MIME types:
- `text/plain`: Plain text
- `application/octet-stream`: Any kind of binary data
- `text/html`: HTML files
- `image/jpeg`: JPEG images
- `application/json`: JSON data
In Pega, whenever a file is attached to a case, an instance of the Data-WorkAttach-File class is created.
This class does store the MIME type in a property called pyAttachMimeType, but the value is not always correct, because it is derived from the file extension. If the extension of a PDF file is changed to .png, pyAttachMimeType will show image/png.
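To see why an extension-based lookup can be fooled, here is a minimal sketch using the JDK's own name-based guess, `URLConnection.guessContentTypeFromName`. This is only an analogy: it is an assumption that Pega's internal lookup behaves similarly, and the file names are made up for the example. The point is that the method receives nothing but the name, so the file content is never consulted.

```java
import java.net.URLConnection;

// Illustration only: the JDK's name-based guess, not Pega's internal logic.
class ExtensionGuess {
    static String guessFromName(String fileName) {
        // Looks up the extension in the JDK's built-in table; returns null if unknown.
        return URLConnection.guessContentTypeFromName(fileName);
    }

    public static void main(String[] args) {
        // Imagine both names refer to the same PDF bytes on disk.
        System.out.println("invoice.pdf -> " + guessFromName("invoice.pdf"));
        System.out.println("invoice.png -> " + guessFromName("invoice.png"));
    }
}
```

Renaming the file changes the answer even though the bytes are identical, which is the same symptom seen with pyAttachMimeType.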
An external library can be used to find the correct MIME type. In this case we are using Apache Tika.
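Content-based detection works by inspecting the file's bytes rather than its name. The sketch below is a deliberately tiny, hand-written version of that idea, checking the leading "magic bytes" of three well-known formats. `MagicSniffer` is a hypothetical class for illustration; Tika's real detection covers far more formats and uses more signals than this.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified sniffer; Tika's detection is far more thorough.
class MagicSniffer {
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // True if data begins with exactly the bytes in prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    static String sniff(byte[] data) {
        if (startsWith(data, PDF)) return "application/pdf";
        if (startsWith(data, PNG)) return "image/png";
        if (startsWith(data, JPEG)) return "image/jpeg";
        // Generic fallback for unrecognized binary data.
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] renamedPdf = "%PDF-1.4 ...".getBytes(StandardCharsets.US_ASCII);
        // The file name never enters the picture, so renaming cannot change the result.
        System.out.println(sniff(renamedPdf)); // prints "application/pdf"
    }
}
```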
The JAR can be found here: Tika Jar. Version: 1.18 (so that it is compatible with Pega).
Be sure to consult your LSA (Lead System Architect) before importing the JAR, and take a cold backup of the system before the import.
Create an activity that takes the pzInsKey of the file's Data-WorkAttach-File instance as a parameter.
Function:
Java code:
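try {
    // pyAttachStream holds the attachment content as Base64 text.
    String encodedAttachStream = DownloadPage.getProperty("pyAttachStream").toString();
    String fileName = DownloadPage.getProperty("pxAttachName").toString();
    byte[] decodedAttachStream = Base64Util.decodeToByteArray(encodedAttachStream);
    Tika tika = new Tika();
    // Detect from the content bytes only, so a renamed extension cannot influence the result.
    try (TikaInputStream stream = TikaInputStream.get(decodedAttachStream)) {
        String mimeType = tika.detect(stream);
        // Informational message, so it is not logged at error level.
        oLog.infoForced("MIME for file: " + fileName + " is " + mimeType);
        return mimeType;
    }
} catch (Exception e) {
    // Pass the exception itself so the stack trace reaches the log.
    oLog.error("Exception occurred in finding the MIME type", e);
}
// Sentinel value telling the caller that detection failed.
return "ERROR";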
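The decoding step in the function can be tried outside Pega with the JDK's `java.util.Base64`. This is a minimal sketch under the assumption that pyAttachStream uses the standard Base64 alphabet. The sample string is made up for the example and encodes only the first eight bytes of a PDF header. `AttachStreamPeek` is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical stand-in for the Base64Util step, runnable without Pega.
class AttachStreamPeek {
    static byte[] decode(String encodedStream) {
        // Assumes the standard Base64 alphabet with padding.
        return Base64.getDecoder().decode(encodedStream);
    }

    // Returns the decoded bytes as ASCII text, for inspecting a file header.
    static String headerAsText(String encodedStream) {
        return new String(decode(encodedStream), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String sample = "JVBERi0xLjQ="; // Base64 of the ASCII text "%PDF-1.4"
        System.out.println(headerAsText(sample)); // prints "%PDF-1.4"
    }
}
```

Checking the decoded header by hand like this can be a quick sanity test before wiring the function into the activity.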
Imports:
org.apache.tika.Tika
org.apache.tika.io.TikaInputStream
com.pega.pegarules.pub.util.Base64Util
Output: