IDentrix: Data Scientist
Seeking a passionate Data Scientist with a proven record of building data-driven solutions, who is interested in data mining and modeling specialized, large, and connected datasets.
IDentrix (www.identrix.com) is disrupting the world of risk management and insider threat with a first-of-its-kind continuous risk management platform. This opening offers a unique opportunity with a financially sound spin-off providing competitive compensation, growth opportunities, and an entrepreneurial drive to innovate.
We are currently seeking a passionate Data Scientist with a proven record of building data-driven solutions, who is interested in data mining and modeling specialized, large, and connected datasets. We are looking for a creative individual who can path-find solutions to complex business problems, develop prototypes, and participate in the transition to a working product. This position is located in Bethesda, MD.
The position requires expertise across the data life cycle, from data collection through analysis to the interpretation and presentation of results, as well as in large-scale computing, statistical modeling, applied machine learning, data mining or natural language processing, and data visualization.
Responsibilities:
- Conceptualize and demonstrate effectiveness of data-driven prototypes in collaboration with other engineers and scientists
- Develop data-driven insights to improve current products, anticipate future business needs and identify opportunities for complex analyses
- Design, implement and validate algorithms on real-world problems specific to IDentrix using tools like Mongo / Map-Reduce, R, Talend, Hadoop, Spark, Elastic Search, etc.
- Collaborate with technologists and business stakeholders to transition your findings into products
- Keep an active presence and stay up to date on the latest technologies by attending local meetups or academic conferences
- Review existing data sources to assess how correlated and how reliable they are (missing fields, etc.)
- Identify data gaps and potential public and commercially available data sources (directly available or derivable)
- Evaluate and architect tools and algorithms to improve normalization, human readability, relevance, and discovery of data across multiple disparate data sources
- Establish and distinguish explicit and implicit relationships across individual data points over time. Evaluate tools, algorithms, and data sets, and prototype correlations using selected tools
- Analyze the current ranking analytics and algorithms, along with their data inputs, correlations, and outputs; tune and enhance them to improve outputs, or propose alternatives
- Evaluate enhanced algorithm and tooling options (e.g. non-linear classifiers, adaptive scoring based on statistical distribution, time series analysis to determine how identified features change and correlate over time)
- Evaluate opportunities for enhanced correlation and forecasting through clustering and pattern recognition
Qualifications and Experience:
- Master's degree, PhD, or equivalent experience in a relevant technical field, such as computer science, machine learning, data mining, or natural language processing
- Relevant industry experience with success in solving analytical problems using quantitative approaches
- Solid programming skills (e.g. Java, R, Scala) and experience with scripting languages (e.g. Python, Ruby, Perl)
- Comfort with manipulating and analyzing complex, high-dimensional data from sources of varying quality
- Significant knowledge of statistical data analysis and experiment design using R, Matlab, etc.
- Results-driven, with the ability to prioritize and focus on ideas and features that will have significant, measurable impact as defined by your interaction with stakeholders
- Ability to set and meet project objectives and milestones
- Good oral and written communication skills, and the ability to provide technical leadership
- Working knowledge of the following:
  - NoSQL databases (Mongo) and Map-Reduce, relational databases and BI techniques (required)
  - Data or text mining, applied machine learning, or natural language processing at scale (required)
  - Visualization tools (required)
  - Distributed computing platforms, such as Hadoop (Hive, HBase, Pig), Spark, GraphLab (highly desired)
  - Distributed search engines, such as Elastic Search or Solr/Lucene (desired)
Location: Bethesda, MD
Please submit your resume to email@example.com
For more information, go to www.identrix.com
IDentrix is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.