Saved by Irina K
on November 9, 2018 at 4:33:43 am
 

This Wiki is licensed CC-BY-NC-SA (Creative Commons Attribution-Noncommercial-Share Alike 3.0 License).

 

https://blog.irvingwb.com/blog/2014/04/why-do-we-need-data-science-when-weve-had-statistics-for-centuries.html

 

Why Do We Need Data Science when We’ve Had Statistics for Centuries?

 

Data Science is emerging as one of the hottest new professions and academic disciplines in these early years of the 21st century.  A number of articles have noted that the demand for data scientists is racing ahead of supply.  People with the necessary skills are scarce, primarily because the discipline is so new.  But the situation is rapidly changing, as universities around the world have started to offer different kinds of graduate programs in data science.  This year, for example, NYU is offering two new degrees: a general Master in Data Science and a more domain-specific Master in Applied Urban Science and Informatics.

It’s very exciting to contemplate the emergence of a major new discipline.  It reminds me of the advent of computer science in the 1960s and 1970s.  Like data science, computer science had its roots in a number of related areas, including math, engineering and management.  In its early years, the field attracted people from a variety of other disciplines who started out using computers in their work or studies, and eventually switched to computer science from their original field.

This was the case with me.  I used computers extensively while a student at the University of Chicago, where I worked closely with Professor Clemens Roothaan, one of the pioneers in the use of computers in physics and chemistry.  As an undergraduate, I worked part-time at the university’s supercomputing center, which he founded, and later he was my thesis advisor as a graduate student in physics.  When the time came to look for a job, I realized that I enjoyed the computing side of my work more than the physics.  I decided to switch fields and in 1970 joined the computer science department at IBM’s Watson Research Center.

Not unlike data science today, computing had to overcome the initial resistance of some prominent academics.  I still remember a meeting in 1965 with a very eminent physicist from whom I was taking a graduate course.  He asked me what I planned to do research on for my degree, and I told him that I was already working with Professor Roothaan on atomic and molecular calculations.  He just said that good theoretical physics should require no more than pencil and paper, rather than these elaborate new computers.  In his mind, this wasn’t real physics.  A number of the physics faculty felt the same way.  Change does not come easy, even for brilliant physicists.

 

Computer science has since become a well-respected academic discipline.  It has grown extensively since its early days and expanded in many new directions.  It’s quite possible that being around in the early days of computer science and computing in general is part of the reason I’m so interested in the evolution of data science today.

So, what is data science all about?  One of the best papers on the subject is Data Science and Prediction by Vasant Dhar, professor in NYU’s Stern School of Business and Director of NYU’s Center for Business Analytics, which was published in the Communications of the ACM in December 2013.

“Use of the term data science is increasingly common, as is big data,” Dhar writes in the opening paragraph.  “But what does it mean?  Is there something unique about it?  What skills do data scientists need to be productive in a world deluged by data?  What are the implications for scientific inquiry?”

He defines data science as being essentially the systematic study of the extraction of knowledge from data.  But, analyzing data is something people have been doing with statistics and related methods for a while.  “Why then do we need a new term like data science when we have had statistics for centuries?  The fact that we now have huge amounts of data should not in and of itself justify the need for a new term.”

In short, it’s all about the difference between explaining and predicting.  Data analysis has been generally used as a way of explaining some phenomenon by extracting interesting patterns from individual data sets with well-formulated queries.  Data science, on the other hand, aims to discover and extract actionable knowledge from the data, that is, knowledge that can be used to make decisions and predictions, not just to explain what’s going on.

The raw materials of data science are not independent data sets, no matter how large they are, but heterogeneous, unstructured data sets of all kinds, e.g., text, images, video.  The data scientist will not simply analyze the data, but will look at it from many angles, with the hope of discovering new insights.

One of the problems with conducting such an in-depth, exploratory analysis is that the multiple data sets that are typically required to do so are often found within organizational silos, be they different lines of business in a company, different companies in an industry or different institutions across society at large.  Data science platforms and tools aim to address this problem by working with, linking together and analyzing data sets previously locked away in disparate silos.

“Unlike database querying, which asks ‘What data satisfies this pattern (query)?’ discovery asks ‘What patterns satisfy this data?’,” notes Dhar.  “Specifically, our concern is finding interesting and robust patterns that satisfy the data, where interesting is usually something unexpected and actionable and robust is a pattern expected to occur in the future.”
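
The contrast Dhar draws between querying and discovery can be made concrete with a small sketch. The example below is only an illustration and does not come from Dhar's paper: the shopping baskets are invented toy data, and the names `query` and `discover` are hypothetical. The first function starts from a pattern and returns the data that satisfies it; the second starts from the data and returns the item pairs that recur often enough to count as candidate patterns.

```python
from collections import Counter
from itertools import combinations

# Toy data, invented for illustration: each set is one shopping basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"bread", "butter", "chips"},
    {"beer", "chips", "milk"},
]

def query(pattern):
    """Database-style question: which baskets satisfy this given pattern?"""
    return [b for b in baskets if pattern <= b]

def discover(min_support):
    """Discovery-style question: which item pairs recur often in the data?"""
    counts = Counter()
    for b in baskets:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(b), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

matches = query({"bread", "butter"})   # we supply the pattern
patterns = discover(min_support=2)     # the data supplies the patterns
print(len(matches), patterns)
```

Counting pairs is about the simplest possible form of discovery. Whether a recurring pair is "interesting" or "robust" in Dhar's sense still has to be judged separately, for instance by checking that it keeps appearing in data collected later.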

The article discusses the key skills data scientists should have, starting with machine learning, a complex concept which Dhar explains in a particularly simple way.  

“Most of us are trained to believe theory must originate in the human mind based on prior theory, with data then gathered to demonstrate the validity of the theory.  Machine learning turns this process around.  Given a large trove of data, the computer taunts us by saying, If only you knew what question to ask me, I would give you some very interesting answers based on the data.  Such a capability is powerful since we often do not know what question to ask. . .” 

“Suitably designed machine learning algorithms help find such patterns for us.  To be useful both practically and scientifically, the patterns must be predictive.  The emphasis on predictability typically favors Occam’s razor, or succinctness, since simpler models are more likely to hold up on future observations than more complex ones, all else being equal. . .”
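
The point about simpler models holding up better on future observations can be checked on a toy problem. The sketch below is an illustration under stated assumptions, not something taken from the paper: the data are simulated from an assumed rule y = 2x + 1 plus random noise, and the two models (`fit_line`, a two-parameter least-squares line, and `fit_memorizer`, which simply recalls the nearest training point) are hypothetical stand-ins for "simple" and "complex".

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample(n):
    """Draw n noisy points from the assumed true relationship y = 2x + 1."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in xs]

train, future = sample(200), sample(1000)

def fit_line(points):
    """Simple model: ordinary least-squares line with two parameters."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_memorizer(points):
    """Complex model: return the y of the training point with the closest x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared prediction error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

line, memorizer = fit_line(train), fit_memorizer(train)
print("error on training data:", mse(line, train), mse(memorizer, train))
print("error on future data:  ", mse(line, future), mse(memorizer, future))
```

The memorizer reproduces the training data perfectly because it has effectively one parameter per observation, yet it carries the training noise into every new prediction. The line fits the training data less closely but should do better on the unseen points, which is the behavior the quote describes.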

Data scientists should also have good computer science skills, including data structures, algorithms, systems and scripting languages, as well as a good understanding of correlation, causation and related concepts which are central to modeling exercises involving data.

“The final skill set is the least standardized and somewhat elusive and to some extent a craft but also a key differentiator to be an effective data scientist - the ability to formulate problems in a way that results in effective solutions. . . formulation expertise involves the ability to see commonalities across very different problems . . .”

Like computing, one of the most exciting parts of data science is that it can be applied to many domains of knowledge.  But doing so effectively requires domain expertise to identify the important problems to solve in a given area, the kinds of questions we should be asking and the kinds of answers we should be looking for, as well as how to best present whatever insights are discovered so they can be understood by domain practitioners in their own terms.  Garbage-in, garbage-out, a phrase I often heard in the early days of computing, is just as applicable to data science today.

Physics, chemistry, biology and other natural science disciplines have long been practicing their own version of data science.  In physics, for example, “a theory is expected to be complete in the sense a relationship among certain variables is intended to explain the phenomenon completely, with no exceptions. . .  In such domains, the explanatory and predictive models are synonymous.”  

But, given our newfound ability to gather valuable data on almost any topic, prediction can now apply to softer disciplines like the health and social sciences.  Dhar points out that while these fields generally lack solid theories, “large amounts of data can result in accurate predictive models, even though no causal insights are immediately apparent.  As long as their prediction errors are small, they could still point us in the right direction for theory development.”

Finally, beyond access to the appropriate skills, are there cultural and management implications in embracing data science in the business world?

“Besides recognizing and nurturing the appropriate skill sets, it requires a shift in managers’ mind-sets toward data-driven decision making to replace or augment intuition and past practices.  A famous quote by 20th-century American statistician W. Edwards Deming - In God we trust, everyone else please bring data - has come to characterize the new orientation, from intuition-based decision making to fact-based decision making. . . It is suddenly possible to test many of their established intuitions, experiment cheaply and accurately, and base decisions on data.  This opportunity requires a fundamental shift in organizational culture, one seen in organizations that have embraced the emerging world of data for decision making.”

 

Things that matter

Among the many reasons you would want to become a data scientist is that you can make a positive contribution to society. Data science can give you some pretty super superpowers. One of them is reshaping industries like healthcare. The amount of data produced about patients and illnesses rises by the second, opening new opportunities for better structured and more informed healthcare. The challenge is to carefully analyze the data in order to be able to recognize problems quickly and accurately – like deepsense.ai did in diagnosing diabetic retinopathy with deep learning.

Did you know that deep learning can help predict dangerous seismic events and keep miners safe? Underground mining is fraught with threats including fires, methane outbreaks or seismic tremors and bumps. An automatic system for predicting and alerting against such dangers is of utmost importance – and also a great challenge for data scientists. Our deepsense.ai team created a machine learning model for the Data Mining Challenge: Predicting Dangerous Seismic Events in Active Coal Mines, which was the winning solution, and one we take great pride in.

Another superpower is saving rare species. When you think of rescuing endangered animals, you see remote jungles and scientists chasing them. This is a stereotype that has changed a lot in recent years. Complex predictive models and algorithms can create insights that help scientists analyze threats to wildlife and create a solution that can save animals – all from the relative comfort of a desk. In fact, it was at our very desktops that we created the Facebook for whales, and it works with 87% accuracy!

 

 



https://deepsense.ai/introduction-to-machine-learning/

 

Why do we need more data scientists and why should you become one?

 

Demand

According to LinkedIn’s 2017 U.S. Emerging Jobs Report, the number of data scientists has grown over 650% since 2012. Yet there are still too few people exploiting the opportunities in this field. Why has it grown so fast?

Companies need to use data to run and grow their everyday business. The fundamental goal of data science is to help companies make quicker and better decisions, which can take them to the top of their market, or at least – especially in the toughest red oceans – be a matter of long-term survival. The number of companies prepared to use big data is increasing. As Dresner Advisory Services laid out in their Big Data Analytics Market Study, forty percent of non-users expect to adopt big data in the next two years.

What is more, you can apply machine learning on smaller data sets, such as ones from a local company’s social media or shopping gift card history. This provides even more opportunities and increases the demand for data scientists. Job growth in the next decade is expected to exceed growth from the past ten years, creating 11.5M jobs by 2026, according to the U.S. Bureau of Labor Statistics. Companies are building up their data science teams to embrace data analytics and will make it integral to their success. Why are these analytics so important? Is it worth working for one of these companies? You will find the answer in the next two chapters.

Influence

Data science changes how decisions are made, and companies are adopting a data-driven approach on a huge scale. Data-driven decisions made with advanced data analytics benefit all manner of companies, from global behemoths to medium-sized companies down to local businesses looking to get ahead. Lack of data is rarely an issue – mountains of it are collected every single second, and we are beginning to understand the potential and influence it can have. Data sets in the right hands can help predict and shape the future.

The problem is getting data sets to mingle. It is the data scientist’s role to transform organisations from reactive environments with static and aged data, to automated ones that continuously learn in real time. Forecasts are simple – data is a valuable resource and investing in it will definitely pay off.

Tractica forecasts that worldwide revenue from deployments of AI software, hardware, and services will increase from $14.9 billion in 2017 to $23.6 billion in 2018, a year-over-year increase of 58%.

Do we need more data scientists?

Now, knowing that data science is in huge demand, you are probably wondering who is going to do all the work. Do we have enough data scientists? Maybe the market is already flush with experts. Nothing could be further from the truth – data scientists are few and far between, and highly sought after. IBM predicts demand for data scientists will soar 28% by 2020. Machine learning and data science are generating more jobs than there are experts to fill them, which is why these two fields are the fastest growing tech employment areas today.

Why should you become a data scientist?

Let’s start from the bottom of Maslow’s pyramid of human needs, which you secure with money. According to Glassdoor, in 2016 data science was the single highest paid profession. If data is money, as they say, then this should come as no surprise. The combination of skills necessary to do data science the right way is not common. The good news, however, is that if you want to become a data scientist and are willing to develop yourself, you are very likely to succeed. A background in mathematics, statistics or physics is a good foundation to build upon. You don’t necessarily need to have finished a data science program. We write a lot about learning methods on our blog, which you’ll see if you read our next post. Sign up for our newsletter if you would like to be updated.

Make the world easier

Besides its financial and economic aspects, data science is simply a fascinating discipline, one which affects many areas of our everyday lives and makes the world a better place. We already use it in many fields, such as quick and easy customer service, intelligent navigation, recommendations and voice-to-text. You can even improve the resolution of an image with deep learning.
We don’t have space enough to chronicle all of the ways that data science is improving people’s lives. It is indispensable to the banking sector as it is used to detect fraud by analyzing the behavior of financial institutions in real time. Elsewhere, robots will be used to help the elderly and the disabled gain mobility and independence. Data science makes these breakthroughs accessible to individuals, solves social problems and modernizes business. Most importantly, you can take part in the revolution data science is bringing about.


 

 


 

 

 

 

 
