By adopting an architecture that matches your specific requirements during setup, you can get the best performance out of your Amazon Redshift cluster. Let's take a look at some of the architectural choices available to manage workloads and steer clear of outages.(more…)
In 2020, when a surging pandemic and safety protocols shuttered many offices, we were among the IT firms that switched entirely to remote work. We adapted to its rhythms almost instantly, logging into work like clockwork, collaborating over Zoom and Meet, and diligently meeting deadlines and release dates. One thing that remained to be ironed out was access to internal systems and data. An in-house mobile application became the need of the hour. Building the iOS app was no sweat, but the distribution path wasn't so clear-cut at first.
I’ll soon explain how we sorted this out.(more…)
This post is for anyone trying to implement public key hash SSL pinning in iOS using TrustKit. The process is straightforward, unless a tiny missed detail trips you up. A lot of documentation is already available on this topic; I'm just bringing the whole process under one roof.(more…)
Claim adjudication, the process by which an insurance company determines its financial liability for a claim, is complex and time-consuming. Adjudication can be quick if the received claim is completely in order, in the sense that all the information is accurate and the claim is within the limits of the policy. But, as with all things in life, this is rarely the case.(more…)
In the world of big data analytics, PySpark, the Python API for Apache Spark, has a lot of traction because of its rapid development possibilities. Apart from Python, Spark provides high-level APIs in Java, Scala, and R. Despite the simplicity of the Python interface, creating a new PySpark project involves running long commands. Take, for example, the command to submit a job in a new project:
$SPARK_HOME/bin/spark-submit \
  --master local[*] \
  --packages 'com.somesparkjar.dependency:1.0.0' \
  --py-files packages.zip \
  --files configs/etl_config.json \
  jobs/etl_job.py
This is not the most convenient or intuitive way to set up a simple file structure.
So is there an easy way to get started with PySpark?(more…)
A large variety of fraud patterns, combined with insufficient data on fraud, makes insurance fraud detection a very challenging problem. Many algorithms are available today to classify claims as fraudulent or genuine. To understand how the various classification algorithms perform in fraud detection, I compared them using vehicle insurance claims data.(more…)
Research shows that children primarily learn languages by observing patterns in the words they hear. Computer scientists are taking a similar approach to train computers to process human language.
Imagine that you are working on machine translation or a similar Natural Language Processing (NLP) problem. Can you process the corpus as a whole? No. You will have to break it into sentences first and then into words. This process of splitting an input corpus into smaller subunits is known as tokenization, and the resulting units are tokens. For instance, when a paragraph is split into sentences, each sentence is a token. This is a fairly straightforward process in English but not so in Malayalam (and some other Indic languages).(more…)
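The two-level split described above can be sketched in plain Python. This is a naive regex-based approach that works for simple English text only (real pipelines use proper tokenizers, and Indic scripts need language-aware handling); the sample sentence is illustrative:

```python
import re

text = "Tokenization splits text into units. Sentences first, then words."

# Level 1: split the corpus into sentence tokens.
# Naive rule: break after ., ! or ? followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", text.strip())

# Level 2: split each sentence into word tokens.
words = [re.findall(r"\w+", s) for s in sentences]

print(sentences)
# ['Tokenization splits text into units.', 'Sentences first, then words.']
print(words[0])
# ['Tokenization', 'splits', 'text', 'into', 'units']
```

The lookbehind keeps the sentence-final punctuation attached to its sentence. Rules like these break down quickly (abbreviations such as "Dr." in English, or agglutinative word forms in Malayalam), which is exactly why tokenization is harder than it first appears.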