May 2017
# Marc Harper, Ph.D.

Contact: marc@marcharper.net -- marc.harper@gmail.com

I am a data scientist with a background in applied mathematics, consulting, educational technology, bioinformatics, evolutionary biology, physics, and information theory, among others. For the past eight years I have built data-driven products, created novel statistical methods for extracting information and visualizing data, and presented detailed analyses to a huge variety of audiences including executives, customers, and academics.

I have an extensive background in programming in Python, C++, and other languages; proficiency with major machine learning tools such as scikit-learn and R; and experience building websites with frameworks and libraries such as Django, Flask, SQLAlchemy, jQuery, and others.

I've taught a part-time data science course for professionals with General Assembly in Santa Monica. The course covers a wide variety of data science topics (syllabus). I also developed and QA'd content for the full-time Data Science Immersive course.

ETL, construction of a data analytics database and dashboard, time-series analysis and modeling of x-ray emission spectra, various other types of data analysis such as A/B testing and analysis of customer field data, general ad-hoc problem-solving, experiment design.

Created ALEKS PPL, an artificial intelligence based mathematics placement system. Implemented placement programs at ~100 universities including technology integrations (SSO, data feeds, etc.), institutional policies, data analysis, marketing strategy, training, and conference presentations. Developed an analytics platform for placement data and assisted numerous schools in their research. [Publication, Publication]

Research in bioinformatics, evolutionary game theory, machine learning, statistical inference, and biochemistry, among others. We created phenotype sequencing, a method of determining which genes are causal for a given phenotype using high-throughput sequencing. We also created a machine learning based strategy for the iterated prisoner's dilemma. See below for more publications.

Doctoral research in Evolutionary Game Theory. Taught many courses from basic math through Calculus III. Five appearances on the list of instructors rated outstanding (top 10%) by students at the University of Illinois. Nominated for multiple departmental and campus-wide teaching awards.

- Ph.D. Mathematics / Mathematical Biology, University of Illinois Urbana-Champaign (2009)
- M.S. Mathematics, University of Illinois Urbana-Champaign (2006)
- B.S. Physics, B.S. Mathematics, University of Florida (2004)
- Coursera: Completed courses on Machine Learning and Natural Language Processing, and 8 courses on data science

- General Skills: Data Analysis, Data Visualization, Machine Learning and Statistical Inference, Reinforcement Learning, Bioinformatics, Scientific and Mathematical Computation, IT Consulting, Security and Encryption, Project Management, Educational Technology
- Programming Languages: [Expert] Python [Experienced] C++, Go [Previous Use] Java, JavaScript, Perl, Bash, Haskell
- Machine Learning Software: R, the Python ecosystem including scikit-learn, Statsmodels, Pandas, Jupyter, Matplotlib, NumPy, TensorFlow
- Frameworks: Django, jQuery, Jinja2, SQLAlchemy | CSS, SQL, HTML, etc.
- Software: Git, LaTeX, various office suites, and countless others. Long-time Linux user (10+ years).

- Axelrod: I am one of three maintainers of a library that runs and analyzes iterated prisoner's dilemma tournaments. See also the many visualizations available and various subrepositories, including the Axelrod-dojo for training new machine learning based strategies.
- python-ternary: A Python library for ternary plots and ternary heatmaps using matplotlib
- stationary: Python library for exact and approximate computations of stationary distributions of multidimensional Markov processes in evolutionary dynamics and their entropy rates. Can compute approximate stationary distributions for generic Markov processes and exact distributions for reversible processes on the simplex. Includes optional C++ computation and SVG rendering for large populations (millions of edges).
- mpsim: Multi-threaded Python library for generating many sample trajectories of Markov processes specified by a transition graph

- *An open reproducible framework for the study of the iterated prisoner's dilemma*, with Vince Knight, Owen Campbell, Karol Langner, et al. Journal of Open Research Software (open access) [ArXiv preprint]. The Axelrod library is an open source Python package that allows for reproducible game theoretic research into the Iterated Prisoner's Dilemma. (2016)
- *Stationary Stability for Evolutionary Dynamics in Finite Populations*, with Dashiell Fryer. Entropy (open access) [ArXiv preprint]. We show that the maxima and minima of the Moran process satisfy an analog of evolutionary stability (incorporating mutation), generalizing the Lyapunov theory of the replicator equation to finite population Markov processes with mutation. Gallery of examples. More precisely, we show that the stationary distribution of the Moran process (and related processes) with mutation in finite populations contains information about the evolutionary stability of states of the underlying process. (2016)
- *Entropic Equilibria Selection of Stationary Extrema in Finite Populations*, with Dashiell Fryer. [ArXiv preprint]. In this paper we use the stationary distribution and entropy rates of the Moran process with mutation to compare equilibria within a Markov process and across similar Markov processes. Altering the strength of selection, mutation rate, or population size can change which equilibrium is most likely under the Moran process with mutation. (2015)
- *The Art of War: Beyond Memory-one Strategies in Population Games*, with Chris Lee and Dashiell Fryer. PLoS One [ArXiv preprint]. We present a highly robust machine learning-based strategy for the prisoner's dilemma in population games that naturally forms coalitions, is typically able to invade any other opponent (more often than a neutral mutant), and is highly resistant to invasion by other strategies. (2015)
- *Lyapunov Functions for Time-Scale Dynamics on Riemannian Geometries of the Simplex*, with Dashiell Fryer. Dynamic Games and Applications (DGAA) preprint-pdf (Formerly titled "Stability of Evolutionary Dynamics on Time Scales", ArXiv preprint). We give a far-reaching Lyapunov theorem for incentive dynamics on time-scales for a large class of Riemannian geometries, with a wealth of examples. This work substantially generalizes the results in my 2011 Physica D paper "Escort Evolutionary Game Theory". Azimuth blog post (2014)
- *Entropy Rates of the Multidimensional Moran Processes and Generalizations*. [ArXiv preprint]. This is a followup to "The Inherent Randomness of Evolving Populations", generalizing to multidimensional populations. (2014)
- *Incentive Processes in Finite Populations*, with Dashiell Fryer. [ArXiv preprint] We introduce the incentive process, a generalization of the Moran process to include alternate selection mechanisms, and give analytic fixation probabilities in some cases. (2013)
- *The Inherent Randomness of Evolving Populations*, Physical Review E. [ArXiv preprint] Computations and theorems on the entropy rates of the Moran and Wright-Fisher processes with mutations. (2013)
- *Inferring Fitness in Finite Populations with Moran-like dynamics*. [ArXiv preprint] Relative fitness can be inferred from trajectories of the Moran process using Bayesian inference. (2013)
- *Escort Evolutionary Game Theory*, Physica D, Vol 240, Issue 18, pg 1411-1415 doi:10.1016/j.physd.2011.04.008 [ArXiv preprint] This paper explores the dynamics of generalized entropies and information divergences, simultaneously deriving Lyapunov functions for an infinite family of dynamics that includes the replicator and projection dynamics. (2011)
- (Semi-Survey) *Information Geometry and Evolutionary Game Theory*. [ArXiv preprint] (2009)
- *The Replicator Equation as a Continuous Inference Dynamic*. [ArXiv preprint] Replicator dynamics and Bayesian inference are closely related formally. Cosma Shalizi independently found a similar relationship; see Appendix A of his paper Dynamics of Bayesian Updating with Dependent Data and Misspecified Models. (2009)

- *Comprehensive Detection of Genes Causing a Phenotype using Phenotype Sequencing and Pathway Analysis*, with Luisa Gronenberg, James Liao, and Chris Lee. PLoS One [ArXiv preprint] We enhanced the phenotype sequencing method with gene pathway and functional gene association databases for a large increase in detection power. (2014)
- *Phenotype Sequencing: High-Throughput Discovery of the Genetic Causes of a Phenotype*, with Chris Lee. Microbe Magazine Feature (2013)
- *Genome-wide Analysis of Mutagenesis Bias and Context Sensitivity of N-methyl-N'-nitro-N-nitrosoguanidine (NTG)*, with Chris Lee. Mutation Research, Volume 731, Issues 1-2, 1 March 2012, Pages 64-67 http://dx.doi.org/10.1016/j.mrfmmm.2011.10.011 [citations] (2012)
- *Phenotype sequencing: identifying the genes that cause a phenotype directly from pooled sequencing of independent mutants*, with Zugen Chen, Tracy Toy, Lara Machado, Stan Nelson, James Liao, and Chris Lee. PLoS ONE. [Phenoseq at GitHub] [Documentation for Phenoseq code] [citations] (2011)

- *Detecting Concepts Crucial for Success in Mathematics Courses from Knowledge State-based Placement Data*, with Alison Ahlgren Reddy. [ArXiv preprint] An in-depth analysis of topic-level placement data from the University of Illinois. (2013)
- *Mathematics Placement at the University of Illinois*, with Alison Ahlgren Reddy. PRIMUS, Volume 23, Issue 8, pages 683-702 post-print pdf An overview of the data from the successful placement program from the University of Illinois. (2013)
- Contributed Chapter, *ALEKS-based Placement at the University of Illinois*, in *Knowledge Spaces: Applications in Education*, with Alison Ahlgren Reddy (2013)
- *Assessment and Placement through Calculus I at the University of Illinois*, with Alison Ahlgren Reddy. Notices of the AMS. (2011)

- *Characterizing Finite Population Dynamics via Information Theory*, Statistical Mechanics and Information in Evolution slides (Munich, July 2016)
- *Information Transport and Evolutionary Dynamics*, NIMBioS, Entropy and Information in Biological Systems slides (April 2015)
- *Equilibrium Selection for Markov Processes via Random Trajectory Entropy with applications to Finite Population Biology*, with Dashiell Fryer, Joint Math 2015 slides (January 2015)
- *Characterizations of Stationary Extrema with Applications to Finite Population Models*, with Dashiell Fryer, Joint Math 2015 slides (January 2015)
- *A Powerful Long Memory Strategy for the Prisoner's Dilemma*, with Dashiell Fryer and Chris Lee, Joint Math 2015 slides (January 2015)
- *Stationary Stability for Evolutionary Dynamics in Finite Populations* [abstract], with Dashiell Fryer, SIAM Conference on the Life Sciences (August 2014)
- *Quantifying the Relationships among Natural Selection, Mutation, and Stochastic Drift in Multidimensional Finite Populations* [abstract], SIAM Conference on the Life Sciences (August 2014)
- *A Novel Strategy That Dominates Zero Determinant and Other Known Strategies in Multiplayer Evolutionary Games* [abstract], with Chris Lee and Dashiell Fryer, SIAM Conference on the Life Sciences (August 2014)
- *Stationary Stability for Evolutionary Dynamics in Finite Populations* [abstract], with Dashiell Fryer, SIAM Annual Meeting (July 2014)
- *Time Scale Calculus and Evolutionary Dynamics* [abstract], with Dashiell Fryer, SIAM Annual Meeting (July 2014)
- *Quantifying the Relationships among Natural Selection, Mutation, and Stochastic Drift in Multidimensional Finite Populations* [abstract], SIAM Annual Meeting (July 2014)
- *The Inherent Randomness of Evolving Populations* [abstract], JMM (Jan 2014)
- *Evolutionary Stability in Finite Populations* [abstract], with Dashiell Fryer, JMM (Jan 2014)
- *Stationary Stability in Finite Populations*, UCLA Probability Theory Seminar slides [pdf] (Dec 2013)
- *Time-scale Lyapunov functions for Incentive Dynamics on Riemannian Geometries*, with Dashiell Fryer, SIAM DS (May 2013)
- *Phenotype Sequencing: Identifying the Genes and Pathways that cause a Phenotype using EcoCyc*, Conference on Predicting Cell Metabolism and Phenotypes slides [pdf] (March 2013)
- *Inferring Fitness in Finite, Variably-Sized, Evolving, and Dynamically-Structured Populations*, Joint Mathematics Meeting slides [pdf] (Jan 2013)
- *Time-scale Lyapunov functions for Incentive Dynamics on Riemannian Geometries*, with Dashiell Fryer, Joint Mathematics Meeting slides [pdf] (January 2013)
- *Quartet partition tests for O(N log N) tree construction*, Bioinformatics Retreat, UCLA (February 2012)
- *Phenotype Sequencing*, Joint Mathematics Meeting (January 2012)
- *Information Theory and Geometry in Evolutionary Dynamics*, Info-Evo Seminar, UCLA (November 2011)
- *Phenotype Sequencing*, IPAM Workshop, UCLA (October 2011)
- *Information Gradients and Evolutionary Dynamics*, Seminar at the University of Illinois (July 2010)

- *Precalculus Redesign: The Influence of a Placement Program and the Power of a Name*, with Alison Reddy, MAA Mathfest (August 2014)
- Panel discussion on online education in mathematics at the Missouri MAA meeting; presentation of results from the placement program at the University of Illinois at a conference in Connecticut. (March 2014)
- *Identifying Concepts Critical for Success in Calculus at the University of Illinois* [abstract], with Alison Ahlgren Reddy, JMM (Jan 2014)
- *Identifying Crucial Mathematics Concepts and Skills for Course Success*, with Alison Reddy, AMATYC (Nov 2013)
- *The Impact of Arithmetic Skills on Success in Calculus II and III*, with Alison Ahlgren Reddy, Joint Mathematics Meetings (Jan 2013)
- *Connecting student knowledge and course performance at the University of Illinois*. Joint Mathematics Meetings (Jan 2012)
- *Identifying Crucial Concepts and Skills for Success in College Algebra through Calculus*, with Alison Ahlgren Reddy. Mathfest (Aug 2011)
- *Connecting student knowledge and course performance at the University of Illinois*, with Alison Ahlgren Reddy. Joint Mathematics Meetings (Jan 2011)
- *Readiness Assessment and Course Placement through Introductory Calculus*, with Alison Ahlgren Reddy. Joint Mathematics Meetings (Jan 2010)