1) In the opening vignette, the CERN Data Aggregation System (DAS), built on MongoDB (a Big Data management infrastructure), used relational database technology.
Answer: FALSE
Diff: 2 Page Ref: 277
2) The term “Big Data” is relative as it depends on the size of the using organization.
Answer: TRUE
Diff: 2 Page Ref: 279
3) In the Luxottica case study, outsourcing enhanced the ability of the company to gain insights into their data.
Answer: FALSE
Diff: 2 Page Ref: 283-284
4) Many analytics tools are too complex for the average user, and this is one justification for Big Data.
Answer: TRUE
Diff: 2 Page Ref: 284
5) In the investment bank case study, the major benefit brought about by the supplanting of multiple databases by the new trade operational store was providing real-time access to trading data.
Answer: TRUE
Diff: 2 Page Ref: 288
6) Big Data uses commodity hardware, which is expensive, specialized hardware that is custom built for a client or application.
Answer: FALSE
Diff: 2 Page Ref: 289
7) MapReduce can be easily understood by skilled programmers due to its procedural nature.
Answer: TRUE
Diff: 2 Page Ref: 291
8) Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.
Answer: TRUE
Diff: 2 Page Ref: 291
9) Hadoop and MapReduce require each other to work.
Answer: FALSE
Diff: 2 Page Ref: 295
10) In most cases, Hadoop is used to replace data warehouses.
Answer: FALSE
Diff: 2 Page Ref: 295
11) Despite their potential, many current NoSQL tools lack mature management and monitoring tools.
Answer: TRUE
Diff: 2 Page Ref: 295
12) The data scientist is a profession for a field that is still largely being defined.
Answer: TRUE
Diff: 2 Page Ref: 298
13) There is a current undersupply of data scientists for the Big Data market.
Answer: TRUE
Diff: 2 Page Ref: 300
14) The Big Data and Analysis in Politics case study makes it clear that the unpredictability of elections makes politics an unsuitable arena for Big Data.
Answer: FALSE
Diff: 2 Page Ref: 301
15) For low latency, interactive reports, a data warehouse is preferable to Hadoop.
Answer: TRUE
Diff: 2 Page Ref: 306
16) If you have many flexible programming languages running in parallel, Hadoop is preferable to a data warehouse.
Answer: TRUE
Diff: 2 Page Ref: 306
17) In the Dublin City Council case study, GPS data from the city’s buses and CCTV were the only data sources for the Big Data GIS-based application.
Answer: FALSE
Diff: 2 Page Ref: 309-310
18) It is important for Big Data and self-service business intelligence to go hand in hand to get maximum value from analytics.
Answer: TRUE
Diff: 1 Page Ref: 313
19) Big Data simplifies data governance issues, especially for global firms.
Answer: FALSE
Diff: 2 Page Ref: 313
20) Current total storage capacity lags behind the digital information being generated in the world.
Answer: TRUE
Diff: 2 Page Ref: 315
21) Using data to understand customers/clients and business operations to sustain and foster growth and profitability is
- A) easier with the advent of BI and Big Data.
- B) essentially the same now as it has always been.
- C) an increasingly challenging task for today’s enterprises.
- D) now completely automated with no human intervention required.
Answer: C
Diff: 2 Page Ref: 279
22) A newly popular unit of data in the Big Data era is the petabyte (PB), which is
- A) 10^9 bytes.
- B) 10^12 bytes.
- C) 10^15 bytes.
- D) 10^18 bytes.
Answer: C
Diff: 2 Page Ref: 281
23) Which of the following sources is likely to produce Big Data the fastest?
- A) order entry clerks
- B) cashiers
- C) RFID tags
- D) online customers
Answer: C
Diff: 2 Page Ref: 281-282
24) Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?
- A) volatility
- B) periodicity
- C) inconsistency
- D) variability
Answer: D
Diff: 2 Page Ref: 282
25) In the Luxottica case study, what technique did the company use to gain visibility into its customers?
- A) visibility analytics
- B) data integration
- C) focus on growth
- D) customer focus
Answer: B
Diff: 2 Page Ref: 283-284
26) Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?
- A) in-memory analytics
- B) in-database analytics
- C) grid computing
- D) appliances
Answer: A
Diff: 2 Page Ref: 286
27) Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources?
- A) in-memory analytics
- B) in-database analytics
- C) grid computing
- D) appliances
Answer: C
Diff: 2 Page Ref: 286
28) How does Hadoop work?
- A) It integrates Big Data into a whole so large data elements can be processed as a whole on one computer.
- B) It integrates Big Data into a whole so large data elements can be processed as a whole on multiple computers.
- C) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on one computer.
- D) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
Answer: D
Diff: 3 Page Ref: 291
29) What is the Hadoop Distributed File System (HDFS) designed to handle?
- A) unstructured and semistructured relational data
- B) unstructured and semistructured non-relational data
- C) structured and semistructured relational data
- D) structured and semistructured non-relational data
Answer: B
Diff: 2 Page Ref: 291
30) In a Hadoop “stack,” what is a slave node?
- A) a node where bits of programs are stored
- B) a node where metadata is stored and used to organize data processing
- C) a node where data is stored and processed
- D) a node responsible for holding all the source programs
Answer: C
Diff: 2 Page Ref: 292
31) In a Hadoop “stack,” what node periodically replicates and stores data from the Name Node should it fail?
- A) backup node
- B) secondary node
- C) substitute node
- D) slave node
Answer: B
Diff: 2 Page Ref: 292
32) All of the following statements about MapReduce are true EXCEPT
- A) MapReduce is a general-purpose execution engine.
- B) MapReduce handles the complexities of network communication.
- C) MapReduce handles parallel programming.
- D) MapReduce runs without fault tolerance.
Answer: D
Diff: 2 Page Ref: 295
33) In the Big Data and Analytics in Politics case study, which of the following was an input to the analytic system?
- A) census data
- B) assessment of sentiment
- C) voter mobilization
- D) group clustering
Answer: A
Diff: 2 Page Ref: 301
34) In the Big Data and Analytics in Politics case study, what was the analytic system output or goal?
- A) census data
- B) assessment of sentiment
- C) voter mobilization
- D) group clustering
Answer: C
Diff: 2 Page Ref: 301
35) Traditional data warehouses have not been able to keep up with
- A) the evolution of the SQL language.
- B) the variety and complexity of data.
- C) expert systems that run on them.
- D) OLAP.
Answer: B
Diff: 2 Page Ref: 303
36) Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse?
- A) ANSI 2003 SQL compliance is required
- B) online archives alternative to tape
- C) unrestricted, ungoverned sandbox explorations
- D) analysis of provisional data
Answer: C
Diff: 2 Page Ref: 306
37) What is Big Data’s relationship to the cloud?
- A) Hadoop cannot be deployed effectively in the cloud just yet.
- B) Amazon and Google have working Hadoop cloud offerings.
- C) IBM’s homegrown Hadoop platform is the only option.
- D) Only MapReduce works in the cloud; Hadoop does not.
Answer: B
Diff: 2 Page Ref: 308
38) Companies with the largest revenues from Big Data tend to be
- A) the largest computer and IT services firms.
- B) small computer and IT services firms.
- C) pure open source Big Data firms.
- D) non-U.S. Big Data firms.
Answer: A
Diff: 2 Page Ref: 311
39) In the health sciences, the largest potential source of Big Data comes from
- A) accounting systems.
- B) human resources.
- C) patient monitoring.
- D) research administration.
Answer: C
Diff: 2 Page Ref: 320
40) In the Discovery Health insurance case study, the analytics application used available data to help the company do all of the following EXCEPT
- A) predict customer health.
- B) detect fraud.
- C) lower costs for members.
- D) open its own pharmacy.
Answer: D
Diff: 2 Page Ref: 323-324
41) Most Big Data is generated automatically by ________.
Answer: machines
Diff: 2 Page Ref: 279
42) ________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.
Answer: Veracity
Diff: 2 Page Ref: 282
43) In-motion ________ is often overlooked today in the world of BI and Big Data.
Answer: analytics
Diff: 2 Page Ref: 282
44) The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than “small” data.
Answer: value proposition
Diff: 2 Page Ref: 282
45) As the size and the complexity of analytical systems increase, the need for more ________ analytical systems is also increasing to obtain the best performance.
Answer: efficient
Diff: 2 Page Ref: 286
46) ________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.
Answer: In-database analytics
Diff: 2 Page Ref: 286
47) ________ bring together hardware and software in a physical unit that is not only fast but also scalable on an as-needed basis.
Answer: Appliances
Diff: 2 Page Ref: 286
48) Big Data employs ________ processing techniques and nonrelational data storage capabilities in order to process unstructured and semistructured data.
Answer: parallel
Diff: 2 Page Ref: 289
49) In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multi-structured data. Examples include indexing and search, graph analysis, etc.
Answer: MapReduce
Diff: 2 Page Ref: 291
50) The ________ Node in a Hadoop cluster provides client information on where in the cluster particular data is stored and if any nodes fail.
Answer: Name
Diff: 2 Page Ref: 292
51) A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.
Answer: tracker
Diff: 2 Page Ref: 292
52) HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.
Answer: database
Diff: 2 Page Ref: 293
53) Hadoop is primarily a(n) ________ file system and lacks capabilities we’d associate with a DBMS, such as indexing, random access to data, and support for SQL.
Answer: distributed
Diff: 2 Page Ref: 294
54) HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.
Answer: NoSQL
Diff: 2 Page Ref: 295
55) In the eBay use case study, load ________ helped the company meet its Big Data needs with the extremely fast data handling and application availability requirements.
Answer: balancing
Diff: 2 Page Ref: 296
56) As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.
Answer: capture
Diff: 2 Page Ref: 303
57) In open-source databases, the most important performance enhancement to date is the cost-based ________.
Answer: optimizer
Diff: 2 Page Ref: 304
58) Data ________ or pulling of data from multiple subject areas and numerous applications into one repository is the raison d’être for data warehouses.
Answer: integration
Diff: 2 Page Ref: 305
59) In the energy industry, ________ grids are one of the most impactful applications of stream analytics.
Answer: smart
Diff: 2 Page Ref: 315
60) In the U.S. telecommunications company case study, the use of analytics via dashboards has helped to improve the effectiveness of the company’s ________ assessments and to make their systems more secure.
Answer: threat
Diff: 2 Page Ref: 319
61) In the opening vignette, what is the source of the Big Data collected at the European Organization for Nuclear Research or CERN?
Answer: Forty million times per second, particles collide within the LHC, each collision generating particles that often decay in complex ways into even more particles. Precise electronic circuits all around the LHC record the passage of each particle via a detector as a series of electronic signals, and send the data to the CERN Data Centre (DC) for recording and digital reconstruction. The digitized summary of data is recorded as a “collision event.” About 15 petabytes of digitized summary data are produced annually, and physicists process this data to determine whether the collisions have thrown up any interesting physics.
Diff: 2 Page Ref: 276
62) List and describe the three main “V”s that characterize Big Data.
Answer:
∙ Volume: This is obviously the most common trait of Big Data. Many factors contributed to the exponential increase in data volume, such as transaction-based data stored through the years, text data constantly streaming in from social media, increasing amounts of sensor data being collected, automatically generated RFID and GPS data, and so forth.
∙ Variety: Data today comes in all types of formats, ranging from traditional databases to hierarchical data stores created by end users and OLAP systems, to text documents, e-mail, XML, and meter-collected and sensor-captured data, to video, audio, and stock ticker data. By some estimates, 80 to 85 percent of all organizations’ data is in some sort of unstructured or semistructured format.
∙ Velocity: This refers to both how fast data is being produced and how fast the data must be processed (i.e., captured, stored, and analyzed) to meet the need or demand. RFID tags, automated sensors, GPS devices, and smart meters are driving an increasing need to deal with torrents of data in near-real time.
Diff: 2 Page Ref: 280-281
63) List and describe four of the most critical success factors for Big Data analytics.
Answer:
∙ A clear business need (alignment with the vision and the strategy). Business investments ought to be made for the good of the business, not for the sake of mere technology advancements. Therefore, the main driver for Big Data analytics should be the needs of the business at any level: strategic, tactical, and operational.
∙ Strong, committed sponsorship (executive champion). It is a well-known fact that if you don’t have strong, committed executive sponsorship, it is difficult (if not impossible) to succeed. If the scope is a single or a few analytical applications, the sponsorship can be at the departmental level. However, if the target is enterprise-wide organizational transformation, which is often the case for Big Data initiatives, sponsorship needs to be at the highest levels and organization-wide.
∙ Alignment between the business and IT strategy. It is essential to make sure that the analytics work is always supporting the business strategy, and not the other way around. Analytics should play the enabling role in successful execution of the business strategy.
∙ A fact-based decision-making culture. In a fact-based decision-making culture, the numbers rather than intuition, gut feeling, or supposition drive decision making. There is also a culture of experimentation to see what works and what doesn’t. To create a fact-based decision-making culture, senior management needs to do the following: recognize that some people can’t or won’t adjust; be a vocal supporter; stress that outdated methods must be discontinued; ask to see what analytics went into decisions; link incentives and compensation to desired behaviors.
∙ A strong data infrastructure. Data warehouses have provided the data infrastructure for analytics. This infrastructure is changing and being enhanced in the Big Data era with new technologies. Success requires marrying the old with the new for a holistic infrastructure that works synergistically.
Diff: 2 Page Ref: 285-286
64) When considering Big Data projects and architecture, list and describe five challenges designers should be mindful of in order to make the journey to analytics competency less stressful.
Answer:
∙ Data volume: The ability to capture, store, and process the huge volume of data at an acceptable speed so that the latest information is available to decision makers when they need it.
∙ Data integration: The ability to combine data that is not similar in structure or source and to do so quickly and at reasonable cost.
∙ Processing capabilities: The ability to process the data quickly, as it is captured. The traditional way of collecting and then processing the data may not work. In many situations data needs to be analyzed as soon as it is captured to leverage the most value.
∙ Data governance: The ability to keep up with the security, privacy, ownership, and quality issues of Big Data. As the volume, variety (format and source), and velocity of data change, so should the capabilities of governance practices.
∙ Skills availability: Big Data is being harnessed with new tools and is being looked at in different ways. There is a shortage of data scientists with the skills to do the job.
∙ Solution cost: Since Big Data has opened up a world of possible business improvements, there is a great deal of experimentation and discovery taking place to determine the patterns that matter and the insights that turn to value. To ensure a positive ROI on a Big Data project, therefore, it is crucial to reduce the cost of the solutions used to find that value.
Diff: 3 Page Ref: 286-287
65) Define MapReduce.
Answer: As described by Dean and Ghemawat (2004), “MapReduce is a programming model and an associated implementation for processing and generating large data sets. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.”
Diff: 2 Page Ref: 289-290
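The definition in question 65 can be illustrated with a minimal single-process sketch in Python. It is not Hadoop's or Google's implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration, and the parallel execution that a real cluster provides is only simulated by handling the input splits one after another.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each split would be mapped on a different node;
    # here the splits are simply processed in sequence.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits, made up for this example.
splits = ["big data big insight", "data drives insight"]
counts = word_count(splits)
print(counts)
```

Because each call to `map_phase` depends only on its own split and each call to `reduce_phase` depends only on one key, both steps could be distributed across machines without changing the program's logic, which is the property the quoted definition emphasizes.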
66) What is NoSQL as used for Big Data? Describe its major downsides.
Answer:
∙ NoSQL is a new style of database that has emerged to, like Hadoop, process large volumes of multi-structured data. However, whereas Hadoop is adept at supporting large-scale, batch-style historical analysis, NoSQL databases are aimed, for the most part (though there are some important exceptions), at serving up discrete data stored among large volumes of multi-structured data to end-user and automated Big Data applications. This capability is sorely lacking from relational database technology, which simply can’t maintain needed application performance levels at Big Data scale.
∙ The downside of most NoSQL databases today is that they trade ACID (atomicity, consistency, isolation, durability) compliance for performance and scalability. Many also lack mature management and monitoring tools.
Diff: 2 Page Ref: 295
67) What is a data scientist and what does the job involve?
Answer: A data scientist is a role or a job frequently associated with Big Data or data science. In a very short time it has become one of the most sought-after roles in the marketplace. Currently, data scientists’ most basic skill is the ability to write code (in the latest Big Data languages and platforms). A more enduring skill will be the need for data scientists to communicate in a language that all their stakeholders understand, and to demonstrate the special skills involved in storytelling with data, whether verbally, visually, or (ideally) both. Data scientists use a combination of their business and technical skills to investigate Big Data looking for ways to improve current business analytics practices (from descriptive to predictive and prescriptive) and hence to improve decisions for new business opportunities.
Diff: 2 Page Ref: 297-298
68) Why are some portions of tape backup workloads being redirected to Hadoop clusters today?
Answer:
∙ First, while it may appear inexpensive to store data on tape, the true cost comes with the difficulty of retrieval. Not only is the data stored offline, requiring hours if not days to restore, but tape cartridges themselves are also prone to degradation over time, making data loss a reality and forcing companies to factor in those costs. To make matters worse, tape formats change every couple of years, requiring organizations to either perform massive data migrations to the newest tape format or risk the inability to restore data from obsolete tapes.
∙ Second, it has been shown that there is value in keeping historical data online and accessible. As in the clickstream example, keeping raw data on a spinning disk for a longer duration makes it easy for companies to revisit data when the context changes and new constraints need to be applied. Searching thousands of disks with Hadoop is dramatically faster and easier than spinning through hundreds of magnetic tapes. Additionally, as disk densities continue to double every 18 months, it becomes economically feasible for organizations to hold many years’ worth of raw or refined data in HDFS.
Diff: 2 Page Ref: 304
69) What are the differences between stream analytics and perpetual analytics? When would you use one or the other?
Answer:
∙ In many cases they are used synonymously. However, in the context of intelligent systems, there is a difference. Streaming analytics involves applying transaction-level logic to real-time observations. The rules applied to these observations take into account previous observations as long as they occurred in the prescribed window; these windows have some arbitrary size (e.g., last 5 seconds, last 10,000 observations, etc.). Perpetual analytics, on the other hand, evaluates every incoming observation against all prior observations, where there is no window size. Recognizing how the new observation relates to all prior observations enables the discovery of real-time insight.
∙ When transactional volumes are high and the time-to-decision is too short, favoring nonpersistence and small window sizes, this translates into using streaming analytics. However, when the mission is critical and transaction volumes can be managed in real time, then perpetual analytics is a better answer.
Diff: 2 Page Ref: 315-316
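The distinction drawn in question 69 can be sketched in Python. This is an illustrative example under stated assumptions, not a method from the textbook: the alert rule (flag a value that exceeds a multiple of the mean of earlier values), the function names, and the sample readings are all hypothetical. The streaming version compares each observation only with those inside a fixed-size window, while the perpetual version compares it with every prior observation.

```python
from collections import deque


def streaming_alerts(observations, window_size, factor):
    """Flag values exceeding factor * mean of the last window_size values."""
    window = deque(maxlen=window_size)  # older observations fall out automatically
    alerts = []
    for value in observations:
        if window and value > factor * (sum(window) / len(window)):
            alerts.append(value)
        window.append(value)
    return alerts


def perpetual_alerts(observations, factor):
    """Flag values exceeding factor * mean of ALL prior values (no window)."""
    total = 0.0
    count = 0
    alerts = []
    for value in observations:
        if count and value > factor * (total / count):
            alerts.append(value)
        total += value
        count += 1
    return alerts


# Hypothetical sensor readings, made up for this example.
readings = [10, 10, 10, 100, 100, 100, 100, 30]
windowed = streaming_alerts(readings, window_size=3, factor=2)
unbounded = perpetual_alerts(readings, factor=2)
print(windowed, unbounded)
```

With a window of three readings, the streaming rule adapts to the new level quickly and stops alerting, whereas the perpetual rule keeps the full history in view and flags the shift for longer. This mirrors the trade-off in the answer: small windows suit high volumes and short decision times, while evaluating against all prior observations suits critical cases where the volume is manageable.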
70) Describe data stream mining and how it is used.
Answer: Data stream mining, as an enabling technology for stream analytics, is the process of extracting novel patterns and knowledge structures from continuous, rapid data records. A data stream is a continuous, ordered sequence of instances that in many applications of data stream mining can be read/processed only once or a small number of times using limited computing and storage capabilities. Examples of data streams include sensor data, computer network traffic, phone conversations, ATM transactions, web searches, and financial data. Data stream mining can be considered a subfield of data mining, machine learning, and knowledge discovery. In many data stream mining applications, the goal is to predict the class or value of new instances in the data stream given some knowledge about the class membership or values of previous instances in the data stream.
Diff: 2 Page Ref: 317
Business Intelligence, 3e (Sharda/Delen/Turban)
Chapter 7 Business Analytics: Emerging Trends and Future Directions
1) Oklahoma Gas & Electric employs a two-layer information architecture involving data warehouse and improved and expanded integration.
Answer: FALSE
Diff: 2 Page Ref: 328
2) In the classification of location-based analytic applications, examining geographic site locations falls in the consumer-oriented category.
Answer: FALSE
Diff: 2 Page Ref: 330
3) In the Great Clips case study, the company uses geospatial data to analyze, among other things, the types of haircuts most popular in different geographic locations.
Answer: FALSE
Diff: 2 Page Ref: 331-332
4) From massive amounts of high-dimensional location data, algorithms that reduce the dimensionality of the data can be used to uncover trends, meaning, and relationships to eventually produce human-understandable representations.
Answer: TRUE
Diff: 2 Page Ref: 333
5) In the life coach case study, Kaggle recently hosted a competition aimed at identifying muscle motions that may be used to predict the progression of Alzheimer’s disease.
Answer: TRUE
Diff: 2 Page Ref: 336
6) Content-based filtering approaches are widely used in recommending textual content such as news items and related Web pages.
Answer: TRUE
Diff: 2 Page Ref: 339
7) The basic premise behind social networking is that it gives people the power to share, making the world more open and connected.
Answer: TRUE
Diff: 2 Page Ref: 340
8) Cloud computing originates from a reference to the Internet as a “cloud” and is a combination of several information technology components as services.
Answer: TRUE
Diff: 2 Page Ref: 342
9) Web-based e-mail services such as Google’s Gmail are not examples of cloud computing.
Answer: FALSE
Diff: 2 Page Ref: 342
10) Service-oriented DSS solutions generally offer individual or bundled services to the user as a service.
Answer: TRUE
Diff: 2 Page Ref: 343
11) In service-oriented DSS, an application programming interface (API) serves to populate source systems with raw data and to pull operational reports.
Answer: TRUE
Diff: 2 Page Ref: 344
12) Data-as-a-service began with the notion that data quality could happen in a centralized place, cleansing and enriching data and offering it to different systems, applications, or users, irrespective of where they were in the organization, computers, or on the network.
Answer: TRUE
Diff: 2 Page Ref: 346
13) IaaS helps provide faster information, but provides information only to managers in an organization.
Answer: FALSE
Diff: 2 Page Ref: 346
14) The trend in the consumption of data analytics is away from in-memory solutions and towards mobile devices.
Answer: FALSE
Diff: 2 Page Ref: 347
15) While cloud services are useful for small and midsize analytic applications, they are still limited in their ability to handle Big Data applications.
Answer: FALSE
Diff: 2 Page Ref: 348
16) Analytics integration with other organizational systems makes it harder to identify its impact on the organization.
Answer: TRUE
Diff: 2 Page Ref: 348
17) One way in which computerization has benefitted organizations is by reducing information anxiety.
Answer: FALSE
Diff: 2 Page Ref: 350
18) ES/DSS were found to improve the performance of new managers but not existing managers.
Answer: FALSE
Diff: 2 Page Ref: 350
19) Use of automated decision systems (ADSs) is likely to result in a reduction of middle management.
Answer: TRUE
Diff: 1 Page Ref: 351
20) In designing analytic systems, it must be kept in mind that the right to an individual’s privacy is not absolute.
Answer: TRUE
Diff: 2 Page Ref: 352
21) What kind of location-based analytics is real-time marketing promotion?
- A) organization-oriented geospatial static approach
- B) organization-oriented location-based dynamic approach
- C) consumer-oriented geospatial static approach
- D) consumer-oriented location-based dynamic approach
Answer: B
Diff: 2 Page Ref: 330
22) GPS navigation is an example of which kind of location-based analytics?
- A) organization-oriented geospatial static approach
- B) organization-oriented location-based dynamic approach
- C) consumer-oriented geospatial static approach
- D) consumer-oriented location-based dynamic approach
Answer: C
Diff: 2 Page Ref: 330
23) What new geometric data type in Teradata’s data warehouse captures geospatial features?
- A) NAVTEQ
- B) ST_GEOMETRY
- C) GIS
- D) SQL/MM
Answer: B
Diff: 2 Page Ref: 331
24) A British company called Path Intelligence has developed a system that ascertains how people move within a city or even within a store. What is this system called?
- A) Pathfinder
- B) PathMiner
- C) Footpath
- D) Pathdata
Answer: C
Diff: 2 Page Ref: 333
25) Today, most smartphones are equipped with various instruments to measure jerk, orientation, and sense motion. One of these instruments is an accelerometer, and the other is a(n)
- A) potentiometer.
- B) gyroscope.
- C) microscope.
- D) oscilloscope.
Answer: B
Diff: 2 Page Ref: 336
26) Content-based filtering obtains detailed information about item characteristics and restricts this process to a single user using information tags or
- A) keywords.
- B) passphrases.
- C) key-pairs.
- D) reality mining.
Answer: A
Diff: 2 Page Ref: 339
27) Service-oriented thinking is one of the fastest growing paradigms in today’s economy. Which of the following is NOT a characteristic of service-oriented DSS?
- A) reusability
- B) substitutability
- C) extensibility
- D) originality
Answer: D
Diff: 2 Page Ref: 343
28) All of the following are components in a service-oriented DSS environment EXCEPT
- A) information technology as enabler.
- B) data as infrastructure.
- C) process as beneficiary.
- D) people as user.
Answer: B
Diff: 2 Page Ref: 343
29) Which component of service-oriented DSS can be defined as data that describes the meaning and structure of business data, as well as how it is created, accessed, and used?
- A) application programming interface
- B) analytics
- C) operations and administration
- D) metadata management
Answer: D
Diff: 2 Page Ref: 344
30) Which component of service-oriented DSS can be described as a subset of a data warehouse that supports specific decision and analytical needs and provides business units more flexibility, control, and responsibility?
- A) information delivery portals
- B) information services with library and administrator
- C) extract, transform, load
- D) data marts
Answer: D
Diff: 2 Page Ref: 345
31) Which component of service-oriented DSS can be described as optimizing the DSS environment use by organizing its capabilities and knowledge, and assimilating them into the business processes?
- A) information delivery portals
- B) information services with library and administrator
- C) extract, transform, load
- D) data marts
Answer: B
Diff: 2 Page Ref: 345
32) Which component of service-oriented DSS includes such examples as optimization, data mining, text mining, simulation, and automated decision systems?
- A) application programming interface
- B) analytics
- C) operations and administration
- D) metadata management
Answer: B
Diff: 2 Page Ref: 345
33) Which of the following is true of data-as-a-Service (DaaS) platforms?
- A) Knowing where the data resides is critical to the functioning of the platform.
- B) There are standardized processes for accessing data wherever it is located.
- C) Business processes can access local data only.
- D) Data quality happens on each individual platform.
Answer: B
Diff: 2 Page Ref: 345-346
34) Which of the following offers a flexible data integration platform based on a newer generation of service-oriented standards that enables ubiquitous access to any type of data?
- A) EAI
- B) EII
- C) IaaS
- D) ETL
Answer: C
Diff: 2 Page Ref: 346-347
35) When new analytics applications are introduced and affect multiple related processes and departments, the organization is best served by utilizing
- A) business flow management.
- B) multi-department analysis.
- C) process flow analysis.
- D) business process reengineering.
Answer: D
Diff: 2 Page Ref: 349
36) Research into managerial use of DSS and expert systems found all the following EXCEPT
- A) managers spent more of their time planning.
- B) managers saw their decision making quality enhanced.
- C) managers spent more time in the office and less in the field.
- D) managers were able to devote less of their time fighting fires.
Answer: C
Diff: 2 Page Ref: 350-351
37) Why do analytics applications have the effect of redistributing power among managers?
- A) The more information and analysis managers have, the more power they possess.
- B) Sponsoring an analytics system automatically confers power to a manager.
- C) New analytics applications change managers’ job expectations.
- D) New analytics systems lead to new budget allocations, resulting in increased power.
Answer: A
Diff: 2 Page Ref: 351
38) Services that let consumers permanently enter a profile of information along with a password and use this information repeatedly to access services at multiple sites are called
- A) consumer access applications.
- B) information collection portals.
- C) single-sign-on facilities.
- D) consumer information sign on facilities.
Answer: C
Diff: 2 Page Ref: 353
39) Which of the following is true about the furtherance of homeland security?
- A) There is a lessening of privacy issues.
- B) There is a greater need for oversight.
- C) The impetus was the need to harvest information related to financial fraud after 2001.
- D) Most people regard analytic tools as mostly ineffective in increasing security.
Answer: B
Diff: 2 Page Ref: 353
40) Which of the following is considered the economic engine of the whole analytics industry?
- A) application developers and system integrators
- B) analytics user organizations
- C) analytics industry analysts and influencers
- D) academic providers and certification industries
Answer: B
Diff: 2 Page Ref: 361
41) In the opening vignette, the combination of field infrastructure, geospatial data, enterprise data warehouse, and analytics has enabled OG&E to manage its customer demand in such a way that it can optimize its ________ investments.
Answer: long-term
Diff: 2 Page Ref: 328
42) A critical emerging trend in analytics is the incorporation of location data. ________ data is the static location data used by these location-based analytic applications.
Answer: Geospatial
Diff: 2 Page Ref: 329
43) The surge in location-enabled services has resulted in ________ mining, the analytics of massive databases of historical and real-time streaming location information.
Answer: reality
Diff: 2 Page Ref: 333
44) The Radii mobile app collects information about the user’s habits, interests, spending patterns, and favorite locations to understand the user’s ________.
Answer: personality
Diff: 2 Page Ref: 334
45) Predictive analytics is beginning to enable development of software that is directly used by a consumer. One key concern in employing these technologies is the loss of ________.
Answer: privacy
Diff: 2 Page Ref: 337
46) Collaborative filtering is usually done by building a user-item ratings matrix where each row represents a unique user and each column gives the individual item rating made by the user. The resultant matrix is a dynamic, sparse matrix with a huge ________.
Answer: dimensionality
Diff: 2 Page Ref: 338
47) ________, which stands for Asynchronous JavaScript and XML, is an effective and efficient Web development technique for creating interactive Web applications.
Answer: Ajax
Diff: 2 Page Ref: 340
48) ________ (IaaS) promises to eliminate independent silos of data that exist in systems and infrastructure and enable sharing real-time information for emerging apps, to hide complexity, and to increase availability with virtualization.
Answer: Information-as-a-service
Diff: 3 Page Ref: 346
49) IaaS, AaaS and other ________-based offerings allow the rapid diffusion of advanced analysis tools among users, without significant investment in technology acquisition.
Answer: cloud
Diff: 2 Page Ref: 348
50) A major structural change that can occur when analytics are introduced into an organization is the creation of new organizational ________.
Answer: units
Diff: 2 Page Ref: 349
51) When an organization-wide, major restructuring is needed, the process is referred to as ________.
Answer: reengineering
Diff: 2 Page Ref: 349
52) A research study found that employees using ADS systems were more ________ with their jobs.
Answer: satisfied
Diff: 2 Page Ref: 350
53) Analytics can change the way in which many ________ are made by managers and can consequently change their jobs.
Answer: decisions
Diff: 2 Page Ref: 350
54) As face-to-face communication is often replaced by e-mail, wikis, and computerized conferencing, leadership qualities attributed to physical ________ could become less important.
Answer: appearance
Diff: 2 Page Ref: 351
55) Location information from ________ phones can be used to create profiles of user behavior and movement.
Answer: mobile/cell
Diff: 2 Page Ref: 353
56) For individual decision makers, ________ values constitute a major factor in the issue of ethical decision making.
Answer: personal
Diff: 2 Page Ref: 355
57) Firms such as Nielsen provide ________ data collection, aggregation, and distribution mechanisms and typically focus on one industry sector.
Answer: specialized
Diff: 2 Page Ref: 358
58) Possibly the biggest recent growth in analytics has been in ________ analytics, as many statistical software companies such as SAS and SPSS embraced it early on.
Answer: predictive
Diff: 2 Page Ref: 358
59) Analytics industry analysts and ________ include professional organizations that provide advice to analytics industry providers and users.
Answer: influencers
Diff: 2 Page Ref: 361
60) Southern States Cooperative used analytics to prepare the customized catalogs to suit the targeted ________ needs, resulting in better revenue generation.
Answer: customer
Diff: 2 Page Ref: 366-367
61) How does Oklahoma Gas and Electric use the Teradata platform to manage the electric grid?
Answer: Oklahoma Gas and Electric uses the Teradata platform to organize the large amounts of data that it gathers from the installation of smart meters and other devices on the electric grid at the consumer end. With Teradata’s platform, OG&E has combined its smart meter data, outage data, call center data, rate data, asset data, price signals, billing, and collections into one integrated data platform. The platform also incorporates geospatial mapping of the integrated data using in-database geospatial analytics that add to OG&E’s dynamic segmentation capabilities.
Diff: 2 Page Ref: 328
62) How do the traditional location-based analytic techniques using geocoding of organizational locations and consumers hamper the organizations in understanding “true location-based” impacts?
Answer: Locations based on postal codes offer an aggregate view of a large geographic area. This poor granularity may not be able to pinpoint the growth opportunities within a region. The location of the target customers can change rapidly. An organization’s promotional campaigns might not target the right customers.
Diff: 2 Page Ref: 330
63) In what ways can communications companies use geospatial analysis to harness their data effectively?
Answer: Communication companies often generate massive amounts of data every day. The ability to analyze the data quickly with a high level of location-specific granularity can better identify the customer churn and help in formulating strategies specific to locations for increasing operational efficiency, quality of service, and revenue.
Diff: 2 Page Ref: 332
64) Describe the CabSense application used by the New York City Taxi and Limousine Commission.
Answer: Sense Networks has built a mobile application called CabSense that analyzes large amounts of data from the New York City Taxi and Limousine Commission. CabSense helps New Yorkers and visitors in finding the best corners for hailing a taxi based on the person’s location, day of the week, and time. CabSense rates the street corners on a 5-point scale by making use of machine-learning algorithms applied to the vast amounts of historical location points obtained from the pickups and drop-offs of all New York City cabs. Although the app does not give the exact location of cabs in real time, its data-crunching predictions enable people to get to a street corner that has the highest probability of finding a cab.
Diff: 3 Page Ref: 335
65) What are recommender systems, how are they developed, and how is the data used to build a recommendation system obtained?
Answer:
∙ The term recommender systems refers to a Web-based information filtering system that takes the inputs from users and then aggregates the inputs to provide recommendations for other users in their product or service selection choices.
∙ Two basic approaches that are employed in the development of recommendation systems are collaborative filtering and content filtering.
o In collaborative filtering, the recommendation system is built based on the individual user’s past behavior by keeping track of the previous history of all purchased items. This includes products, items that are viewed most often, and ratings that are given by the users to the items they purchased.
o In the content-based filtering approach, the characteristics of an item are profiled first, and then content-based individual user profiles are built to store information about the characteristics of the specific items that the user has rated in the past. In the recommendation process, the characteristics of the items that the user has rated positively are drawn from the user profile and compared with the characteristics of any new products that the user has not yet rated. Recommendations are made if similarities are found in the item characteristics.
∙ The data necessary to build a recommendation system are collected by Web-based systems in which each user is specifically asked to rate an item on a rating scale, rank the items from most favorite to least favorite, and/or list the attributes of the items that the user likes.
Diff: 3 Page Ref: 338
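The content-based filtering approach described in the answer to question 65 can be sketched as follows. This is a minimal illustration, not a method taken from the text: the item names, the characteristic sets, the rating threshold, and the use of Jaccard similarity are all assumptions made for the example.

```python
# Hypothetical item profiles: each item is described by a set of characteristics.
item_features = {
    "tent": {"outdoor", "camping", "gear"},
    "stove": {"outdoor", "camping", "cooking"},
    "blender": {"kitchen", "cooking", "appliance"},
    "lantern": {"outdoor", "camping", "lighting"},
}

# Hypothetical ratings given by one user on a 1-5 scale.
user_ratings = {"tent": 5, "blender": 1}

def build_profile(ratings, features, like_threshold=4):
    """Collect the characteristics of items the user rated positively."""
    profile = set()
    for item, rating in ratings.items():
        if rating >= like_threshold:
            profile |= features[item]
    return profile

def jaccard(a, b):
    """Similarity of two characteristic sets (assumed measure for this sketch)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(ratings, features, min_similarity=0.3):
    """Recommend unrated items whose characteristics resemble the user profile."""
    profile = build_profile(ratings, features)
    scores = {
        item: jaccard(profile, feats)
        for item, feats in features.items()
        if item not in ratings
    }
    return sorted(
        (item for item, score in scores.items() if score >= min_similarity),
        key=lambda item: (-scores[item], item),
    )

recommendations = recommend(user_ratings, item_features)
print(recommendations)
```

Here the profile is built only from the positively rated tent, so the unrated stove and lantern, which share its outdoor and camping characteristics, are the items recommended.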
66) Web 2.0 is the popular term for describing advanced Web technologies and applications. Describe four main representative characteristics of the Web 2.0 environment.
Answer:
∙ Web 2.0 has the ability to tap into the collective intelligence of users. The more users contribute, the more popular and valuable a Web 2.0 site becomes.
∙ Data is made available in new or never-intended ways. Web 2.0 data can be remixed or “mashed up,” often through Web service interfaces, much the way a dance-club DJ mixes music.
∙ Web 2.0 relies on user-generated and user-controlled content and data.
∙ Lightweight programming techniques and tools let nearly anyone act as a Web site developer.
∙ The virtual elimination of software-upgrade cycles makes everything a perpetual beta or work-in-progress and allows rapid prototyping, using the Web as an application development platform.
∙ Users can access applications entirely through a browser.
∙ An architecture of participation and digital democracy encourages users to add value to the application as they use it.
∙ A major emphasis is on social networks and computing.
∙ There is strong support for information sharing and collaboration.
∙ Web 2.0 fosters rapid and continuous creation of new business models.
Diff: 3 Page Ref: 340
67) What is mobile social networking, and how does it extend the reach of popular social networks?
Answer: Mobile social networking refers to social networking where members converse and connect with one another using cell phones or other mobile devices. Virtually all major social networking sites offer mobile services or apps on smartphones to access their services. The explosion of mobile Web 2.0 services and companies means that many social networks can be based from cell phones and other portable devices, extending the reach of such networks to the millions of people who lack regular or easy access to computers.
Diff: 2 Page Ref: 341
68) What is cloud computing? What is Amazon’s general approach to the cloud computing services it provides?
Answer:
∙ Wikipedia defines cloud computing as “a style of computing in which dynamically scalable and often virtualized resources are provided over the Internet. Users need not have knowledge of, experience in, or control over the technology infrastructures in the cloud that supports them.”
∙ Amazon.com has developed an impressive technology infrastructure for e-commerce as well as for business intelligence, customer relationship management, and supply chain management. It has built major data centers to manage its own operations. However, through Amazon.com’s cloud services, many other companies can employ these very same facilities to gain advantages of these technologies without having to make a similar investment. Like other cloud-computing services, a user can subscribe to any of the facilities on a pay-as-you-go basis. This model of letting someone else own the hardware and software but making use of the facilities on a pay-per-use basis is the cornerstone of cloud computing.
Diff: 2 Page Ref: 342
69) Data and text mining is a promising application of AaaS. What additional capabilities can AaaS bring to the analytic world?
Answer: AaaS can also be used for large-scale optimization, highly complex multi-criteria decision problems, and distributed simulation models. These prescriptive analytics require highly capable systems that can only be realized using service-based collaborative systems that can utilize large-scale computational resources.
Diff: 3 Page Ref: 348
70) Describe your understanding of the emerging term people analytics. Are there any privacy issues associated with the application?
Answer:
∙ Applications such as sensor-embedded badges that employees wear, which track their movement and predict their behavior, have given rise to the term people analytics. This application area combines organizational IT impact, Big Data, and sensors, and it raises privacy concerns. One company, Sociometric Solutions, has reported several such applications of its sensor-embedded badges.
∙ People analytics creates major privacy issues. Should the companies be able to monitor their employees this intrusively? Sociometric has reported that its analytics are only reported on an aggregate basis to their clients. No individual user data is shared. They have noted that some employers want to get individual employee data, but their contract explicitly prohibits this type of sharing. In any case, sensors are leading to another level of surveillance and analytics, which poses interesting privacy, legal, and ethical questions.
Diff: 2 Page Ref: 354-355