Big Data Analytics: 6th International Conference, BDA 2018, Warangal, India, December 18-21, 2018, Proceedings.

Bibliographic Details
Authors and Corporations: Mondal, Anirban; Gupta, Himanshu; Srivastava, Jaideep; Reddy, P. Krishna; Somayajulu, D. V. L. N.
Other Authors: Gupta, Himanshu; Srivastava, Jaideep; Reddy, P. Krishna; Somayajulu, D. V. L. N.
Type of Resource: E-Book
Language: English
Published: Cham: Springer International Publishing AG, 2018. ©2018.
Series: Lecture Notes in Computer Science Ser.
Source: Ebook Central
ISBN: 9783030047801
Table of Contents:
  • Intro
  • Preface
  • Organization
  • Contents
  • Big Data Analytics: Vision and Perspectives
  • Fault Tolerant Data Stream Processing in Cooperation with OLTP Engine
  • 1 Introduction
  • 2 Fault-Tolerance and State Management in Data Stream Processing
  • 2.1 Motivation
  • 2.2 Fault Tolerance in Data Stream Processing
  • 2.3 State Management in Data Stream Processing
  • 2.4 Literature Survey
  • 2.5 Open Problems
  • 3 Efficient and Flexible Data Stream Query Processing
  • 3.1 Approximate Query Processing for Data Stream Processing
  • 3.2 Query Processing on Uncertain/Probabilistic Data Streams
  • 4 HTAP Technology and Data Stream Management
  • 5 Objectives and Research Issues of Our Project
  • 5.1 Objectives
  • 5.2 Research Issues
  • 6 Conclusions
  • References
  • Blockchain-Powered Big Data Analytics Platform
  • 1 Introduction
  • 2 The Case for AI-Driven Blockchain-Enabled Platform for Insurance Industry
  • 3 Architecture of a Blockchain-Powered Big Data Platform
  • 4 Scalable Blockchain Data Management
  • 4.1 Technical Challenges
  • 4.2 Scaling Blockchain Through Data Partitioning Technique
  • 4.3 Transaction Processing Workflow
  • 5 Introducing Federated AI for Insurance Marketplace Through Blockchain
  • 5.1 Technical Challenges
  • 5.2 Initial Solution
  • 6 Blockchain Data Analytics
  • 6.1 Improving Customer Satisfaction
  • 6.2 Enhancing Business Operations
  • 7 Conclusion
  • References
  • Humble Data Management to Big Data Analytics/Science: A Retrospective Stroll
  • 1 Introduction
  • 2 Traditional Data Management
  • 2.1 Our Contributions
  • 3 Data Warehouses
  • 3.1 Our Contributions
  • 4 Event and Stream Data Processing
  • 4.1 Our Contributions
  • 5 Data Mining or Knowledge Discovery in Databases
  • 5.1 Our Contributions
  • 6 Approaches to Big Data Analytics/Science
  • 6.1 Data and Processing Scalability Using Map/Reduce
  • 6.2 Stream-Based Video Situation Analysis
  • 6.3 Modeling and Analyzing Complex Data Using Multiplexes
  • 7 Conclusions
  • References
  • Fusion of Game Theory and Big Data for AI Applications
  • 1 Introduction
  • 2 Game Theory Overview
  • 2.1 Equilibrium
  • 2.2 Mechanism Design - A Reverse Game Engineering Aka Incentive Engineering
  • 3 Information Markets
  • 4 Stackelberg Security Games (SSGs)
  • 5 Trading Agents
  • 6 Internet Ad Auction
  • 7 Conclusions
  • References
  • Financial Data Analytics and Data Streams
  • Distributed Financial Calculation Framework on Cloud Computing Environment
  • Abstract
  • 1 Introduction
  • 2 Preliminaries and Background Information
  • 3 Problem Statement, Related Work, Proposed Solution
  • 3.1 Problem Statement
  • 3.2 Related Work
  • 3.3 Proposed Solution
  • 4 Implementation
  • 4.1 Current Architecture
  • 4.2 Proposed Architecture
  • 5 Results
  • 6 Conclusion
  • 7 Future Work
  • References
  • Testing Concept Drift Detection Technique on Data Stream
  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Problem Formulation
  • 4 Concept Drift Detection Procedure
  • 5 Experiments
  • 5.1 SEA
  • 5.2 Hyperplane
  • 6 Future Work
  • 7 Conclusion
  • References
  • Homogenous Ensemble of Time-Series Models for Indian Stock Market
  • Abstract
  • 1 Introduction
  • 2 Problem Statement
  • 3 Proposed Method
  • 3.1 Univariate Analysis
  • 3.2 Multivariate Analysis
  • 3.3 Ensemble Technique
  • 4 Results and Discussions
  • 5 Conclusion
  • References
  • Improving Time Series Forecasting Using Mathematical and Deep Learning Models
  • Abstract
  • 1 Introduction
  • 1.1 Time Series and Stationarity
  • 1.2 Techniques to Make Time Series Stationary
  • 2 Related Work
  • 3 Methodology: Case Study - Wikipedia Web Traffic
  • 3.1 Dataset
  • 3.2 Implementation
  • 3.3 Time Series Log Transformation
  • 3.4 Differencing
  • 3.5 Decomposition
  • 3.6 Augmented Dickey Fuller Test - Stationarity Test
  • 3.7 Auto Regressive Model
  • 3.8 Moving Average Model
  • 3.9 Auto Regressive Integrated Moving Average Model
  • 3.10 Long Short-Term Memory (LSTM) Network
  • 3.11 Testing of Data
  • 4 Result
  • 5 Discussion
  • 6 Conclusion/Future Scope
  • References
  • Emerging Technologies and Opportunities for Innovation in Financial Data Analytics: A Perspective
  • 1 Introduction
  • 2 Macro-environmental Trends in Financial Services
  • 2.1 Political/Legal Trends
  • 2.2 Economic Trends
  • 2.3 Socio-Cultural/Demographic Trends
  • 3 Applications and Use-Cases
  • 3.1 Blockchain in Financial Services
  • 3.2 AI in Financial Services
  • 3.3 Machine Learning in Financial Services
  • 4 Research Challenges and the Way Forward
  • References
  • Web and Social Media Data
  • Design of the Cogno Web Observatory for Characterizing Online Social Cognition
  • 1 Introduction
  • 2 Related Literature
  • 3 Modeling Social Cognition
  • 4 Identifying Opinion Drivers on Social Media
  • 5 Conclusions and Future Work
  • References
  • Automated Credibility Assessment of Web Page Based on Genre
  • 1 Introduction
  • 2 Approach
  • 2.1 Survey
  • 2.2 Survey Results and Analysis
  • 2.3 Credibility Assessment
  • 3 Scoring, Results and Validation
  • 3.1 WEBCred
  • 3.2 Validation
  • 4 Conclusions and Future Work
  • References
  • CbI: Improving Credibility of User-Generated Content on Facebook
  • 1 Introduction
  • 2 Related Work
  • 2.1 Credibility Assessment of Content on Twitter
  • 2.2 Credibility Assessment of Content on Facebook
  • 3 Credibility Assessment of User-Generated Content on Facebook
  • 3.1 Proposed Credibility Assessment Model
  • 3.2 Credibility Assessment Tools
  • 4 Data Collection and Labeled Dataset Creation
  • 4.1 Data Collection
  • 4.2 Labeled Dataset Creation
  • 5 Automatic Credibility Assessment
  • 5.1 Facebook's Current Techniques to Identify Fake News
  • 5.2 Credibility Assessment Features
  • 5.3 Classification Algorithms
  • 6 Conclusion, Limitations, and Future Work
  • References
  • A Parallel Approach to Detect Communities in Evolving Networks
  • 1 Introduction
  • 2 Prior Research
  • 3 PcDEN: An Incremental Parallel Community Detection Approach
  • 3.1 A New Similarity Measure for Parallel Community Detection
  • 3.2 A Proposed Parallel Approach
  • 3.3 Community Finding in Each Worker
  • 3.4 Finding High and Low Degree Nodes
  • 3.5 Selection of High Degree Nodes in a Community per Worker
  • 3.6 Selection of Low Degree Nodes in a Community per Worker
  • 3.7 Merging Communities from Different Workers
  • 4 Performance Evaluation
  • 4.1 Dataset Used
  • 4.2 Experimental Results
  • 4.3 Scalability
  • 5 Conclusion
  • References
  • Modeling Sparse and Evolving Data
  • Abstract
  • 1 Introduction
  • 2 State-of-the-Art
  • 3 Modeling Sparse and Evolving Data
  • 3.1 Incapacitating Sparseness
  • 3.2 Building Translation Layer
  • 3.3 Handling Complex Queries
  • 4 MTEAV: Extended Functionality
  • 4.1 Need of Extension
  • 4.2 Abstracting Modeling Details
  • 4.3 Flexibility to Evolve Schema
  • 5 Experiments and Results
  • 5.1 MTEAV Versus Existing Models
  • 5.2 Effect of Sparseness
  • 6 Conclusions
  • References
  • Big Data Systems and Frameworks
  • Polystore Data Management Systems for Managing Scientific Data-sets in Big Data Archives
  • Abstract
  • 1 Introduction
  • 2 Palomar Transient Factory (PTF) Data Repository
  • 2.1 PTF Data Processing Requirements
  • 2.2 IRSA Cloud Service and Archive
  • 2.3 Querying Over Cloud-Data Resources of IRSA
  • 3 Polystore Data Management System
  • 3.1 Common Architecture of Polystore Database System
  • 3.2 Need for Polystore Databases
  • 3.3 Query Using Polystore Data Management Systems
  • 4 Polystore Data Management System for PTF Data Archives
  • 4.1 Data Integration Processes and Data Independence
  • 5 Summary and Conclusions
  • References
  • MPP SQL Query Optimization with RTCG
  • 1 Introduction
  • 2 Background
  • 3 Related Work in Databases
  • 4 dbX Architecture
  • 5 Code Generation: Model
  • 5.1 SQL Opportunities
  • 5.2 Modern Hardware
  • 5.3 Optimization Techniques
  • 6 Code Generation: Meta-Questions
  • 6.1 To RTCG or Not?
  • 6.2 When and Where to RTCG?
  • 6.3 What if SQL Is Machine Generated?
  • 6.4 How to Compile?
  • 6.5 Why Not libJIT or LLVM?
  • 6.6 Who Takes the Kill?
  • 7 Experimental Evaluation
  • 7.1 Microbenchmarks
  • 7.2 Macrobenchmarks: TPC-H & TPC-DS
  • 8 Discussion and Conclusion
  • References
  • Big Data Analytics Framework for Spatial Data
  • Abstract
  • 1 Introduction
  • 1.1 Novelty and Contributions
  • 2 Related Work
  • 2.1 Traditional Databases for Spatial Data
  • 2.2 NoSQL Databases for Spatial Data
  • 2.2.1 Cassandra for Spatial Data
  • 2.3 Big Data Computational Frameworks for Spatial Data
  • 2.3.1 Spark for Spatial Data
  • 2.4 Shortcomings of the Existing Systems for Big Spatial Data
  • 3 Integration of Big Data Frameworks - Spark and Cassandra
  • 3.1 Spark-Cassandra Connector
  • 3.1.1 Architecture Implementation
  • 4 Proposed Framework
  • 4.1 Spatial Data Storage Layer
  • 4.1.1 Data Loading
  • 4.1.2 Data Storage
  • 4.1.3 Align Spark-Cassandra Distribution
  • 4.1.4 Associate Spatial Index
  • 4.1.5 Store Spatial Dataframe into Cassandra
  • 4.2 Spark Core Layer
  • 4.3 Spatial Data Processing Layer
  • 4.3.1 Proximity Search
  • 4.3.2 KNN Search
  • 4.3.3 Point Query
  • 4.4 Application Layer
  • 5 Experimental Results and Discussion
  • 5.1 Experimental Setup
  • 5.2 Description of Dataset
  • 5.3 Results
  • 5.3.1 Load Data into Cassandra
  • 5.3.2 Establish Analytics Pipeline
  • 5.3.3 Attribute Query Analysis
  • 5.3.4 Point Query Analysis