Data Scientist, Professional Problem Solver, Applied Mathematician, Developer

Contact: marc@marcharper.net -- marc.harper@gmail.com

Summary

I am a data scientist with a background in applied mathematics, consulting, educational technology, bioinformatics, evolutionary biology, physics, and information theory, among others. For the past eight years I have built data-driven products, created novel statistical methods for extracting information and visualizing data, and presented detailed analyses to a huge variety of audiences including executives, customers, and academics.

I have an extensive background in programming in Python, C++, and other languages; proficiency with major machine learning tools such as scikit-learn and R; and experience building websites with frameworks and libraries such as Django, Flask, SQLAlchemy, jQuery, and others.

Employment

Quantitative Analyst / Data Scientist

Google, Sept 2016 - Present

Data Science Instructor, Content Developer

General Assembly, March 2016 - August 2016

I taught a part-time data science course for professionals with General Assembly in Santa Monica. The course covers a wide variety of data science topics (syllabus). I also developed and QA'd content for the full-time Data Science Immersive course.

Senior Data Scientist

Tribogenics, Inc., March 2016 - June 2016

ETL, construction of a data analytics database and dashboard, time-series analysis and modeling of x-ray emission spectra, various other types of data analysis such as A/B testing and analysis of customer field data, general ad-hoc problem-solving, experiment design.

Technology Consultant, Data Scientist

Covariant Consulting (owner) / Independent Consultant, 2008 - Feb 2016

Created ALEKS PPL, an artificial intelligence based mathematics placement system. Implemented placement programs at ~100 universities including technology integrations (SSO, data feeds, etc.), institutional policies, data analysis, marketing strategy, training, and conference presentations. Developed an analytics platform for placement data and assisted numerous schools in their research. [Publication, Publication]

Bioinformatics Researcher (Postdoctoral Scholar)

UCLA Institute of Genomics and Proteomics, 2009 - 2013 (4.5 years)

Research in bioinformatics, evolutionary game theory, machine learning, statistical inference, and biochemistry, among others. We created phenotype sequencing, a method of determining which genes are causal for a given phenotype using high-throughput sequencing. We also created a machine learning based strategy for the iterated prisoner's dilemma. See below for more publications.

Teaching and Research Assistant, Data Analyst (Graduate student)

University of Illinois, 2004 - 2009

Doctoral research in Evolutionary Game Theory. Taught many courses from basic math through Calculus III. Five appearances on the list of instructors rated outstanding (top 10%) by students at the University of Illinois. Nominated for multiple departmental and campus-wide teaching awards.

System Administrator, Researcher, Teaching Assistant

University of Florida, 2003 - 2004

Education

Skills

Code Samples Github

Publications (Refereed and Preprints)

Evolutionary Dynamics

  • An open reproducible framework for the study of the iterated prisoner's dilemma, with Vince Knight, Owen Campbell, Karol Langner, et al. Journal of Open Research Software (open access) [ArXiv preprint]. The Axelrod library is an open source Python package that allows for reproducible game theoretic research into the Iterated Prisoner's Dilemma (2016)
  • Stationary Stability for Evolutionary Dynamics in Finite Populations, with Dashiell Fryer. Entropy (open access) [ArXiv preprint]. We show that the maxima and minima of the Moran process satisfy an analog of evolutionary stability (incorporating mutation), generalizing the Lyapunov theory of the replicator equation to finite population Markov processes with mutation. Gallery of examples. More precisely, we show that the stationary distribution of the Moran process (and related processes) with mutation in finite populations contains information about the evolutionary stability of states of the underlying process. (2016)
  • Entropic Equilibria Selection of Stationary Extrema in Finite Populations, with Dashiell Fryer. [ArXiv preprint]. In this paper we use the stationary distribution and entropy rates of the Moran process with mutation to compare equilibria within a Markov process and across similar Markov processes. Altering the strength of selection, mutation rate, or population size can change which equilibrium is most likely under the Moran process with mutation. (2015)
  • The Art of War: Beyond Memory-one Strategies in Population Games, with Chris Lee and Dashiell Fryer. PLoS One [ArXiv preprint]. We present a highly robust machine learning-based strategy for the prisoner's dilemma in population games. The strategy naturally forms coalitions, is typically able to invade any other opponent (more often than a neutral mutant), and is highly resistant to invasion by other strategies. (2015)
  • Lyapunov Functions for Time-Scale Dynamics on Riemannian Geometries of the Simplex, with Dashiell Fryer. Dynamic Games and Applications (DGAA) preprint-pdf (Formerly titled "Stability of Evolutionary Dynamics on Time Scales", ArXiv preprint) We give a far-reaching Lyapunov theorem for incentive dynamics on time-scales for a large class of Riemannian geometries, with a wealth of examples. This work substantially generalizes the results in my 2011 Physica D paper "Escort Evolutionary Game Theory". Azimuth blog post (2014)
  • Entropy Rates of the Multidimensional Moran Processes and Generalizations [ArXiv preprint]. This is a followup to "The Inherent Randomness of Evolving Populations", generalizing to multidimensional populations. (2014)
  • Incentive Processes in Finite Populations, with Dashiell Fryer. [ArXiv preprint] We introduce the incentive process, a generalization of the Moran process to include alternate selection mechanisms, and give analytic fixation probabilities in some cases. (2013)
  • The Inherent Randomness of Evolving Populations, Physical Review E. [ArXiv preprint] Computations and theorems on the entropy rates of the Moran and Wright-Fisher processes with mutations. (2013)
  • Inferring Fitness in Finite Populations with Moran-like dynamics. [ArXiv preprint] Relative fitness can be inferred from trajectories of the Moran process using Bayesian inference. (2013)
  • Escort Evolutionary Game Theory, Physica D, Vol 240, Issue 18, pg 1411-1415 doi:10.1016/j.physd.2011.04.008 [ArXiv preprint] This paper explores the dynamics of generalized entropies and information divergences, simultaneously deriving Lyapunov functions for an infinite family of dynamics that includes the replicator and projection dynamics. (2011)
  • (Semi-Survey) Information Geometry and Evolutionary Game Theory. [ArXiv preprint] (2009)
  • The Replicator Equation as a Continuous Inference Dynamic. [ArXiv preprint] Replicator dynamics and Bayesian inference are closely related formally. Cosma Shalizi independently found a similar relationship; see Appendix A of his paper Dynamics of Bayesian Updating with Dependent Data and Misspecified Models. (2009)

Bioinformatics / Biochemistry / Molecular Biology

Educational Technology

Selected Research Presentations

  • Characterizing Finite Population Dynamics via Information Theory, Statistical Mechanics and Information in Evolution slides (Munich, July 2016)
  • Information Transport and Evolutionary Dynamics, NIMBioS, Entropy and Information in Biological Systems slides (April 2015)
  • Equilibrium Selection for Markov Processes via Random Trajectory Entropy with applications to Finite Population Biology, with Dashiell Fryer, Joint Math 2015 slides (January 2015)
  • Characterizations of Stationary Extrema with Applications to Finite Population Models, with Dashiell Fryer, Joint Math 2015 slides (January 2015)
  • A Powerful Long Memory Strategy for the Prisoner's Dilemma, with Dashiell Fryer and Chris Lee, Joint Math 2015 slides (January 2015)
  • Stationary Stability for Evolutionary Dynamics in Finite Populations [abstract], with Dashiell Fryer, SIAM Conference on the Life Sciences (August 2014)
  • Quantifying the Relationships among Natural Selection, Mutation, and Stochastic Drift in Multidimensional Finite Populations [abstract], SIAM Conference on the Life Sciences (August 2014)
  • A Novel Strategy That Dominates Zero Determinant and Other Known Strategies in Multiplayer Evolutionary Games [abstract], with Chris Lee and Dashiell Fryer, SIAM Conference on the Life Sciences (August 2014)
  • Stationary Stability for Evolutionary Dynamics in Finite Populations [abstract], with Dashiell Fryer, SIAM Annual Meeting (July 2014)
  • Time Scale Calculus and Evolutionary Dynamics [abstract], with Dashiell Fryer, SIAM Annual Meeting (July 2014)
  • Quantifying the Relationships among Natural Selection, Mutation, and Stochastic Drift in Multidimensional Finite Populations [abstract], SIAM Annual Meeting (July 2014)
  • The Inherent Randomness of Evolving Populations [abstract], JMM (Jan 2014)
  • Evolutionary Stability in Finite Populations [abstract], with Dashiell Fryer, JMM (Jan 2014)
  • Stationary Stability in Finite Populations , UCLA Probability Theory Seminar slides [pdf] (Dec 2013)
  • Time-scale Lyapunov functions for Incentive Dynamics on Riemannian Geometries, with Dashiell Fryer, SIAM DS (May 2013)
  • Phenotype Sequencing: Identifying the Genes and Pathways that cause a Phenotype using EcoCyc, Conference on Predicting Cell Metabolism and Phenotypes slides [pdf] (March 2013)
  • Inferring Fitness in Finite, Variably-Sized, Evolving, and Dynamically-Structured Populations, Joint Mathematics Meeting slides [pdf] (Jan 2013)
  • Time-scale Lyapunov functions for Incentive Dynamics on Riemannian Geometries, with Dashiell Fryer, Joint Mathematics Meeting slides [pdf] (January 2013)
  • Quartet partition tests for O(N log N) tree construction, Bioinformatics Retreat, UCLA (February 2012)
  • Phenotype Sequencing, Joint Mathematics Meeting (January 2012)
  • Information Theory and Geometry in Evolutionary Dynamics, Info-Evo Seminar, UCLA (November 2011)
  • Phenotype Sequencing, IPAM Workshop, UCLA (October 2011)
  • Information Gradients and Evolutionary Dynamics, Seminar at the University of Illinois (July 2010)

Selected Education Talks/Panels

  • Precalculus Redesign: The Influence of a Placement Program and the Power of a Name, with Alison Reddy, MAA Mathfest (August 2014)
  • Panel discussion on online education in mathematics at the Missouri MAA meeting; presentation of results from the placement program at the University of Illinois at a conference in Connecticut (March 2014)
  • Identifying Concepts Critical for Success in Calculus at the University of Illinois [abstract], with Alison Ahlgren Reddy, JMM (Jan 2014)
  • Identifying Crucial Mathematics Concepts and Skills for Course Success, with Alison Reddy, AMATYC (Nov 2013)
  • The Impact of Arithmetic Skills on Success in Calculus II and III, with Alison Ahlgren Reddy, Joint Mathematics Meetings (Jan 2013)
  • Connecting student knowledge and course performance at the University of Illinois. Joint Mathematics Meetings (Jan 2012)
  • Identifying Crucial Concepts and Skills for Success in College Algebra through Calculus, with Alison Ahlgren Reddy. Mathfest (Aug 2011)
  • Connecting student knowledge and course performance at the University of Illinois, with Alison Ahlgren Reddy. Joint Mathematics Meetings (Jan 2011)
  • Readiness Assessment and Course Placement through Introductory Calculus, with Alison Ahlgren Reddy. Joint Mathematics Meetings (Jan 2010)