Data Science/Machine Learning Vice President

Company: AlixPartners
Location: United States
Salary: $120,000 – $220,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Senior, Expert or higher

Requirements

  • Bachelor’s degree with concentration in Computer Science, Engineering or another quantitative field
  • 5+ years of applicable professional experience
  • Data-oriented personality
  • Ability to synthesize the requests received from team members at client sites
  • Desire to actively engage in geographically dispersed teams
  • Ability to solve problems creatively and innovatively while relying on simple ideas
  • Knowledge of at least one of the following languages (Python, JavaScript, C#, or similar), plus familiarity with at least one ETL tool (such as Alteryx, KNIME, SSIS, Pentaho, or DataStage)
  • Motivated to discover and learn new analytical techniques and software tools to improve the quality of our work
  • Strong verbal and written communication skills in English. Proficiency in other languages is a plus
  • Ability and willingness to work long hours and travel, if necessary, to meet client demands
  • Ability to work full-time in an office and remote environment; physically able to sit/stand at a computer and work in front of a computer screen for significant portions of the workday
  • Willingness to work outside of normal U.S. business hours, particularly as unique projects or needs arise
  • Must become familiar with, promote, and abide by our Core Values as defined by the AlixPartners Code of Conduct, and foster an inclusive environment with people at all levels of an organization

Responsibilities

  • Create ETL workflows, scripts, statistical models, and visualizations while taking responsibility for the design, build, testing, execution, and support of data migration, cleansing, wrangling, and related work
  • Select features, and build and optimize classifiers using machine learning techniques
  • Execute machine learning projects using state-of-the-art methods
  • Extend the company’s data with third-party sources of information when needed
  • Create automated anomaly detection systems and continuously track their performance
  • Experience with common data science toolkits, such as Python, PySpark, and R. Excellence in at least one of these is highly desirable
  • Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
  • Collect data from a wide variety of corporate databases, including various SQL databases (Microsoft, Redshift, Teradata, Oracle, Netezza, etc.), Access, Excel, plain or formatted text files, OLAP cubes (Microsoft, Oracle), and NoSQL databases
  • Parse data out of poorly structured XML and invalid HTML documents
  • Use regular expressions to extract information from unstructured text documents
  • Deal with missing data through multiple imputation or the use of advanced models
  • Automate boring tasks with scripts
  • Build effective, reliable, and robust ETL processes that govern the data ingestion flow
  • Design database models, consistent table structures, and advanced dimensional schemas that enforce data quality and consistency standards
  • Apply modeling approaches, business intelligence patterns, and data management techniques
  • Understanding of cloud architectures. Some knowledge of Azure, AWS, or GCP is desired
  • Demonstrate advanced SQL skills, such as CTEs and window functions, to work with extensive amounts of data at various aggregation levels
  • Review and analyze legacy code/scripts to understand data processing logic and business rules
  • Ability to apply statistical learning languages to build predictive models that enrich, expand, and deepen the understanding of data analyses and solutions
  • Distributed systems knowledge, especially of HDFS and the Hadoop ecosystem
  • Use interactive data visualization tools, such as Tableau and Power BI, to present results in a compelling manner
  • Ability to tell a convincing story to C-level executives using visual charts and dashboards
  • Present complicated technical findings to a non-technical audience

Preferred Qualifications

  • Excellence in at least one of the common data science toolkits, such as Python, PySpark, or R, is highly desirable
  • Some knowledge of Azure, AWS, or GCP is desired