Lead / Senior Data Scientist
Job Description
We are looking for a senior data scientist in the PayU intelligence team who will be primarily responsible for modeling
complex problems, discovering insights, and identifying opportunities through the use of statistical, algorithmic, mining,
and visualization techniques. Your primary focus will be to propose innovative ways of approaching these problems using
graph databases and analytics: applying data mining techniques, performing statistical analysis, validating your findings
through an experimental and iterative approach, and building high-quality prediction systems integrated with our services. You will
need strong business understanding, analytical and problem-solving skills, and programming knowledge.
Responsibilities
Design experiments, test hypotheses, and build models using traditional datasets and graph data.
Apply advanced statistical and predictive modeling techniques to build, maintain, and improve multiple real-time
decision systems.
Identify what data is available and relevant, including internal and external data sources, and leverage new data collection
processes such as geo-location or social media.
Utilize patterns and variations in the volume, speed and other characteristics of data for predictive analysis.
Define the preprocessing and feature engineering for a given dataset, build data augmentation pipelines, train
models and tune their hyperparameters, analyze model errors, and design strategies to overcome them.
Select features, and build and optimize classifiers using machine learning techniques.
Extend the company's data with third-party sources of information when needed.
Create automated anomaly detection systems and continuously track their performance.
Skills and Qualifications
Bachelor's degree in mathematics, statistics, computer science, or a related field; Master's or PhD preferred.
5+ years of relevant quantitative and qualitative research and analytics experience.
Extensive knowledge of statistical techniques.
Ability to come up with solutions to loosely defined business problems by leveraging pattern detection over potentially
large datasets.
Proficiency in statistical analysis, quantitative analytics, forecasting/predictive analytics, multivariate testing, and
optimization algorithms.
Proficiency in deep learning (CNN, RNN, LSTM, attention models, etc.), machine learning (SVM, GLM, boosting, random
forest), graph models, and reinforcement learning.
Experience with open source tools for deep learning and machine learning such as Keras, TensorFlow, PyTorch,
scikit-learn, pandas, etc.
Strong skills in programming (Hadoop MapReduce or other big data frameworks, Java, Python), statistical modeling (R,
Python, SAS), and query languages such as SQL, Hive, and Pig.
Familiarity with basic principles of distributed computing and distributed databases.
Demonstrable ability to quickly understand new concepts - all the way down to the theorems - and to come up with
original solutions to mathematical problems.