August 21, 2018
Daniel Jeske, University of California, Riverside and David Steinberg, Tel-Aviv University
A panel discussion on the theme “A Life Cycle View of Statistics” was held on July 31, 2018, at the Joint Statistical Meetings in Vancouver, Canada. The panelists were John Peterson, Laura Freeman, Ron Kenett and Vijay Nair, and the session was moderated by David Steinberg. An audience of between 50 and 100 JSM registrants was in attendance.
JSM 2018 had adopted #LeadWithStatistics as the conference theme. In line with that vision, a panel was organized on the evolving roles for Statisticians in a world that is increasingly data focused. It was designed as a platform for presenting ideas and experiences that will help clarify the role of Statistics and of Statisticians and promote our leadership potential.
The topics discussed included:
- What is the role of Statisticians in the Big Data era?
- What is needed for Statisticians to stand out as leaders?
- What skills are required by Statisticians in this modern era? What past knowledge areas can be dropped?
- Are Statisticians good at producing information of quality? If not, what should be done to improve this situation?
- How should Statistics be taught to support future needs from Statisticians?
- Are Statistics and Data Science complementary? How?
The panel was moderated by Dr. David Steinberg from Tel-Aviv University. Dr. Steinberg completed his Ph.D. at the University of Wisconsin-Madison under George Box. His research interests include experimental design, biostatistics and industrial statistics. The speakers were Dr. John Peterson, Dr. Laura Freeman, Dr. Ron Kenett and Dr. Vijay Nair. John Peterson is a Senior Statistics Director in the R&D Division of GlaxoSmithKline Pharmaceuticals, where he provides statistical consulting for nonclinical development. Laura Freeman is the Assistant Director in the Operational Evaluation Division of the Institute for Defense Analyses, and advises on defense policy and guidance for test and evaluation of military systems. Ron Kenett is the Chairman of the KPA Group, Senior Research Fellow at the Neaman Institute, Technion and past Research Professor at the Mathematics Department, University of Turin, Italy. He is an applied statistician with a keen interest in industrial statistics. Vijay Nair is the Head of the Statistical Learning and Advanced Computing Group in Corporate Model Risk at Wells Fargo. He previously was the Donald A. Darling Collegiate Professor of Statistics and Professor of Industrial and Operations Engineering at the University of Michigan, Ann Arbor.
The Role of a Statistician
The panel speakers spoke broadly about the changes we are seeing in how statisticians are being asked to work. Laura Freeman called for statisticians to play a role in institutionalizing statistical thinking within organizations, and suggested that leadership and communication skills are more important than technical skills when trying to enable such a transformation. She cited straightforward, but critical, concepts like comparable conditions for A/B testing and sample size considerations as being easily overlooked when a statistician is not involved. Ron Kenett recalled George Box saying statistics is about scientific investigation and urging statisticians to be first-rate scientists. He offered an 8-step life-cycle view of statistics as a methodology for fulfilling Box’s vision. Vijay Nair stated that it is not possible to define a statistician anymore, and instead we should focus on defining the field of statistics. His view is that being at the table and being a mover and a shaker has little to do with your degree or discipline and more to do with your individual personality and what type of person you are. John Peterson pointed out that increasing technological complexity and diversity is driving the need for knowledge and skill diversity for statisticians. He mentioned Martec’s Law, which says technology changes exponentially while organizations change logarithmically.
The interaction between statistics and data science was one of the major topics addressed by the panel. Although “data science” seems like a recent term, and an ongoing revolution, Ron Kenett provided an interesting historical perspective by reviewing its much earlier foreshadowing. Specifically, he reminded the audience that John Tukey wrote a 1962 paper in the Annals of Mathematical Statistics entitled “The Future of Data Analysis,” where he pointed to the existence of an as-yet unrecognized science, whose subject of interest was learning from data, or “data analysis.” Jeff Wu, upon his inauguration in 1997 as Harry Carver Professor of Statistics at the University of Michigan, presented an inaugural lecture titled “Statistics = Data Science?” in which he advocated that statistics be renamed data science and statisticians be renamed data scientists. Vijay Nair noted that Bill Cleveland had published a paper in the International Statistical Review laying out a broader mission for statisticians and called it Data Science. Nair also mentioned Leo Breiman’s 2001 article in Statistical Science, “Statistical Modeling: The Two Cultures,” as another portent of how statistical thinking has been going through a revolution. John Peterson submitted that “statistics” (broadly speaking) may be experiencing some growing pains, as it is a relatively young discipline compared with, say, chemistry. As such, areas of natural specialization (e.g. computing aspects of big data, causal inference, computer graphics, etc.) are maturing and struggling to gain appreciation for their sub-fields within the broader landscape of data science/statistics. But this is good as long as people in these areas work together synergistically to solve problems.
A question from the audience on how academic departments should adapt their curricula to improve training in statistics generated interesting discussion, but no clear prescription emerged as to which courses to remove and which to add. Vijay Nair felt that important data science skills will have to be learned “by doing,” which hints toward project-based courses. He noted that often much of the key data science work goes into cleaning, preprocessing and engineering the data to reach a state where it can be analyzed. Laura Freeman felt service courses in statistics are becoming irrelevant, and that instead students need to be taught how to ask the right questions. Ron Kenett mentioned that research on topics such as problem elicitation is needed. He also addressed the challenge of fitting more topics into a crowded curriculum by considering improvements in the education process. The panel agreed, however, that universities need to continue to move toward data science training even if an optimal path for doing so is not recognized today.
A common theme was that statisticians need to evolve in parallel with technological advances in order to stay relevant. Laura Freeman mentioned the importance of data visualization skills. She reviewed Kotter’s process for leading change, and spoke about the difference between innovation and innovation adoption. In her concluding remarks she noted that the current churn that the field of statistics is experiencing makes it an ideal time for statisticians to establish themselves as data leaders within their organizations, and that it is a good time to be in data analysis.
Ron Kenett emphasized the need for statisticians to turn data into information of quality in order to support better decisions and stronger organizations. He mentioned he is working on a new book with Tom Redman that has a title along these lines. In his concluding remarks he opined that statistics should produce tools and methods for turning data into information of quality and that students need to be trained to contribute this way.
Vijay Nair stressed the need for competency in database management and modern computing skills including Hadoop, SQL, Python, etc. In his prepared comments, he described the huge role that quantitative modeling plays in the modern banking industry. He mentioned that SAS is still an important software package for the industry, and commented on emerging challenges that include the deficiency of traditional methods for large data sets, the rapid adoption of artificial intelligence methodologies and the use of natural language processing in applications. In his concluding remarks he said statisticians have the potential to be good data scientists and suggested it is useful to think about the life cycle of data.
John Peterson pointed out that statisticians cannot possess deep knowledge of all the statistical methods or scientific issues, and as such it is important for them to build strong networks of colleagues to round out their own knowledge areas. This is becoming increasingly important as the pace of technological development is accelerating, and new technology is driving the need for statistical methods to adapt. He advised statisticians to scan the horizon of their own field, as well as their clients’ fields, and anticipate what might become important within their job environment so that they can proactively train in newly required areas. He encouraged statisticians to attend meetings outside of statistics that are important to their clients. In his concluding remarks he suggested that understanding the genesis of data and design of experiments is what sets statisticians apart.