{"id":38787,"date":"2023-08-25T07:42:34","date_gmt":"2023-08-25T07:42:34","guid":{"rendered":"https:\/\/www.devopsschool.com\/blog\/?p=38787"},"modified":"2023-09-22T07:34:04","modified_gmt":"2023-09-22T07:34:04","slug":"what-is-apache-beam-and-use-cases-of-apache-beam","status":"publish","type":"post","link":"https:\/\/www.devopsschool.com\/blog\/what-is-apache-beam-and-use-cases-of-apache-beam\/","title":{"rendered":"What is Apache Beam and use cases of Apache Beam?"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">What is Apache Beam?<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-638-1024x551.png\" alt=\"\" class=\"wp-image-38798\" style=\"width:725px;height:390px\" width=\"725\" height=\"390\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-638-1024x551.png 1024w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-638-300x161.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-638-768x413.png 768w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-638.png 1280w\" sizes=\"auto, (max-width: 725px) 100vw, 725px\" \/><figcaption class=\"wp-element-caption\"><strong><em>What is Apache Beam<\/em><\/strong><\/figcaption><\/figure>\n<\/div>\n\n\n<p>Apache Beam is an open-source unified programming model and a set of APIs for building batch and streaming data processing pipelines. It provides a way to define data processing tasks that can run on various distributed processing backends, such as Apache Spark, Apache Flink, Google Cloud Dataflow, and more. 
The goal of Apache Beam is to provide a portable and consistent way to express data processing pipelines regardless of the underlying execution engine.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Top 12 use cases of Apache Beam:<\/h2>\n\n\n\n<p>Here are the top 12 use cases of Apache Beam:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Real-Time Analytics:<\/strong> Apache Beam is used to process and analyze streaming data in real time. It can be applied in scenarios like monitoring social media feeds, tracking user activities, and generating real-time insights.<\/li>\n\n\n\n<li><strong>Batch ETL (Extract, Transform, Load):<\/strong> Beam enables efficient extraction, transformation, and loading of large volumes of batch data. It&#8217;s commonly used to preprocess and clean data before storing it in data warehouses or databases.<\/li>\n\n\n\n<li><strong>Event Time Processing:<\/strong> Apache Beam provides features for event time processing, which is critical when events arrive late or out of order but must be processed according to the time at which they occurred.<\/li>\n\n\n\n<li><strong>IoT Data Processing:<\/strong> With the rise of the Internet of Things (IoT), Beam can be used to ingest and process data from various sensors and devices in real time.<\/li>\n\n\n\n<li><strong>Clickstream Analysis:<\/strong> For websites and applications, Beam can help analyze clickstream data to understand user behavior and make data-driven decisions to improve user experiences.<\/li>\n\n\n\n<li><strong>Fraud Detection:<\/strong> Beam can be used to detect patterns and anomalies in streaming data, helping to identify potential fraudulent activities in real time.<\/li>\n\n\n\n<li><strong>Financial Data Processing:<\/strong> Beam is applicable in the financial industry for processing and analyzing stock market data, transactions, and trading activities.<\/li>\n\n\n\n<li><strong>Recommendation Systems:<\/strong> In e-commerce and entertainment, Beam can process user interaction data to 
generate personalized recommendations for products, movies, music, and more.<\/li>\n\n\n\n<li><strong>Data Enrichment:<\/strong> Beam can enrich data by joining streams or batches of data with external datasets, providing additional context for analysis.<\/li>\n\n\n\n<li><strong>Machine Learning Pipelines:<\/strong> Beam can be used to preprocess and transform data before feeding it into machine learning models. It supports data preparation steps like feature engineering and normalization.<\/li>\n\n\n\n<li><strong>Log Analysis:<\/strong> For system monitoring and troubleshooting, Beam can process log data in real time, identifying issues and anomalies to ensure smooth operations.<\/li>\n\n\n\n<li><strong>Supply Chain Optimization:<\/strong> In logistics and supply chain management, Beam can process data related to shipments, inventory, and demand to optimize routes and inventory levels.<\/li>\n<\/ol>\n\n\n\n<p>These are just a few examples of how Apache Beam can be applied to various data processing scenarios. 
The key advantage of using Beam is its portability across different processing engines, allowing developers to write pipelines once and run them on different platforms without major code changes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What are the features of Apache Beam?<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-640-1024x406.png\" alt=\"\" class=\"wp-image-38800\" style=\"width:815px;height:323px\" width=\"815\" height=\"323\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-640-1024x406.png 1024w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-640-300x119.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-640-768x304.png 768w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-640.png 1277w\" sizes=\"auto, (max-width: 815px) 100vw, 815px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Features of Apache Beam<\/em><\/strong><\/figcaption><\/figure>\n<\/div>\n\n\n<p>Apache Beam offers a set of features that enable developers to build data processing pipelines that are portable, scalable, and expressive. It provides an abstraction layer that allows users to define data processing tasks without being tied to a specific execution engine. Here are some key features of Apache Beam:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Unified Model:<\/strong> Apache Beam provides a unified programming model for both batch and stream processing. This allows developers to write code that works for both scenarios, promoting code reuse and simplifying development.<\/li>\n\n\n\n<li><strong>Portability:<\/strong> Beam pipelines can be executed on various processing engines, such as Apache Spark, Apache Flink, Google Cloud Dataflow, and more. 
This enables users to choose the most suitable execution engine for their use case.<\/li>\n\n\n\n<li><strong>Parallel Processing:<\/strong> Beam processes data in parallel, taking advantage of the underlying processing engine&#8217;s capabilities to distribute work across multiple nodes and cores, improving performance.<\/li>\n\n\n\n<li><strong>Event Time Processing:<\/strong> Beam supports event time processing, allowing developers to handle out-of-order data based on event timestamps. This is crucial for scenarios where data arrives late or in a different order from the one in which it was produced.<\/li>\n\n\n\n<li><strong>Windowing:<\/strong> Beam provides windowing capabilities for managing and processing data within specific time intervals or windows, facilitating tasks like sessionization and aggregations.<\/li>\n\n\n\n<li><strong>Exactly-Once Processing:<\/strong> On runners that provide it, Beam supports exactly-once processing semantics, ensuring that each piece of data affects the results exactly once, even in the presence of failures and retries.<\/li>\n\n\n\n<li><strong>Dynamic Scaling:<\/strong> On runners that support autoscaling, such as Google Cloud Dataflow, Beam pipelines can dynamically scale up or down based on the incoming data volume, enabling efficient resource utilization.<\/li>\n\n\n\n<li><strong>Backpressure Handling:<\/strong> Beam relies on the mechanisms of the underlying runner to handle backpressure, which occurs when the rate of data production exceeds the rate of data processing. 
This ensures stability and prevents resource exhaustion.<\/li>\n\n\n\n<li><strong>Extensibility:<\/strong> Beam allows users to create custom transformations, sources, and sinks, enabling integration with various data sources, APIs, and storage systems.<\/li>\n\n\n\n<li><strong>State Management:<\/strong> Beam supports distributed state management, allowing developers to maintain and use stateful information across different processing stages.<\/li>\n\n\n\n<li><strong>Checkpointing:<\/strong> For stream processing, Beam supports checkpointing, which enables fault tolerance by periodically saving the pipeline&#8217;s state to a durable storage system.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">How Apache Beam Works and Architecture?<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641-1024x433.png\" alt=\"\" class=\"wp-image-38801\" style=\"width:843px;height:356px\" width=\"843\" height=\"356\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641-1024x433.png 1024w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641-300x127.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641-768x325.png 768w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641-1536x650.png 1536w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-641.png 1756w\" sizes=\"auto, (max-width: 843px) 100vw, 843px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Apache Beam Works and Architecture<\/em><\/strong><\/figcaption><\/figure>\n<\/div>\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Pipeline Definition:<\/strong> Developers define their data processing pipeline using the Apache Beam SDK. 
This includes specifying transformations, data sources, sinks, and other processing steps.<\/li>\n\n\n\n<li><strong>Graph Representation:<\/strong> The pipeline definition is transformed into a directed acyclic graph (DAG) that represents the sequence of transformations and data flows.<\/li>\n\n\n\n<li><strong>Execution Engine:<\/strong> The user specifies the execution engine where the pipeline will run. Apache Beam supports various execution engines like Apache Spark, Apache Flink, Google Cloud Dataflow, and more.<\/li>\n\n\n\n<li><strong>Translation Layer:<\/strong> Apache Beam translates the pipeline&#8217;s logical representation into executable code that is compatible with the chosen execution engine.<\/li>\n\n\n\n<li><strong>Distributed Processing:<\/strong> The execution engine distributes the pipeline tasks across a cluster of machines, performing parallel processing and optimizations.<\/li>\n\n\n\n<li><strong>Data Processing:<\/strong> The execution engine processes the data according to the defined transformations and logic, performing tasks like mapping, filtering, aggregation, and more.<\/li>\n\n\n\n<li><strong>Windowing and Time Handling:<\/strong> If windowing is used, data is organized into windows based on time intervals. 
Event time processing and windowing are applied to ensure accurate data processing.<\/li>\n\n\n\n<li><strong>Output:<\/strong> Processed data is sent to the specified output sinks, which could be databases, storage systems, APIs, or other data destinations.<\/li>\n\n\n\n<li><strong>Fault Tolerance:<\/strong> Apache Beam&#8217;s execution engines ensure fault tolerance by managing checkpoints, maintaining state, and handling failures gracefully.<\/li>\n\n\n\n<li><strong>Scaling:<\/strong> The execution engine can dynamically scale the pipeline based on incoming data volume and available resources, ensuring efficient resource utilization.<\/li>\n\n\n\n<li><strong>Completion and Cleanup:<\/strong> Once the pipeline completes, resources are released, and cleanup tasks are performed.<\/li>\n\n\n\n<li><strong>Monitoring and Metrics:<\/strong> Apache Beam provides monitoring and metrics capabilities to track the progress, performance, and health of the pipeline.<\/li>\n<\/ol>\n\n\n\n<p>Overall, Apache Beam&#8217;s architecture abstracts the complexities of distributed data processing and provides a consistent model for building data pipelines that can be executed on various processing engines. This promotes code portability, scalability, and ease of development across different data processing scenarios.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How to Install Apache Beam?<\/h2>\n\n\n\n<p>To install Apache Beam, you need to have the following:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Java 8 or later<\/li>\n\n\n\n<li>Python 3.6 or later<\/li>\n\n\n\n<li>A compatible runner<\/li>\n<\/ul>\n\n\n\n<p>The following are the detailed steps on how to install Apache Beam on different operating systems:<\/p>\n\n\n\n<p><strong>Windows<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Install Java 8 or later. You can download the Java installer from the Oracle website.<\/li>\n\n\n\n<li>Install Python 3.6 or later. 
You can download the Python installer from the Python website.<\/li>\n\n\n\n<li>Install the Apache Beam SDK for Python. You can do this by running the following command in a terminal window:<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"CSS\" data-shcb-language-slug=\"css\"><span><code class=\"hljs language-css\">  <span class=\"hljs-selector-tag\">pip<\/span> <span class=\"hljs-selector-tag\">install<\/span> <span class=\"hljs-selector-tag\">apache-beam<\/span><span class=\"hljs-selector-attr\">&#91;gcp]<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">CSS<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">css<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p><strong>macOS<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Install Java 8 or later. You can download the Java installer from the Oracle website.<\/li>\n\n\n\n<li>Install Python 3.6 or later. You can download the Python installer from the Python website.<\/li>\n\n\n\n<li>Install the Apache Beam SDK for Python. 
You can do this by running the following command in a terminal window:<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"CSS\" data-shcb-language-slug=\"css\"><span><code class=\"hljs language-css\">  <span class=\"hljs-selector-tag\">pip<\/span> <span class=\"hljs-selector-tag\">install<\/span> <span class=\"hljs-selector-tag\">apache-beam<\/span><span class=\"hljs-selector-attr\">&#91;gcp]<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">CSS<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">css<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p><strong>Linux<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Install Java 8 or later. You can download the Java installer from the Oracle website.<\/li>\n\n\n\n<li>Install Python 3.6 or later. You can download the Python installer from the Python website.<\/li>\n\n\n\n<li>Install the Apache Beam SDK for Python. 
You can do this by running the following command in a terminal window:<\/li>\n<\/ol>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"CSS\" data-shcb-language-slug=\"css\"><span><code class=\"hljs language-css\">  <span class=\"hljs-selector-tag\">pip<\/span> <span class=\"hljs-selector-tag\">install<\/span> <span class=\"hljs-selector-tag\">apache-beam<\/span><span class=\"hljs-selector-attr\">&#91;gcp]<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">CSS<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">css<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Once you have installed Apache Beam, you can start writing pipelines. You can find more information about writing pipelines in the Apache Beam documentation.<\/p>\n\n\n\n<p>Here are some of the benefits of using Apache Beam:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>It is a unified model for data processing pipelines.<\/li>\n\n\n\n<li>It can be used to process data on a variety of platforms, including Apache Spark, Apache Flink, and Google Cloud Dataflow.<\/li>\n\n\n\n<li>It provides a variety of features for data processing, such as batch processing, stream processing, and machine learning.<\/li>\n\n\n\n<li>It is free to use and open source.<\/li>\n<\/ul>\n\n\n\n<p>Here are some of the drawbacks of using Apache Beam:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>It can be complex to learn and use.<\/li>\n\n\n\n<li>It can be slow for some applications, especially those that process a lot of data.<\/li>\n\n\n\n<li>It is not as popular as some other data processing frameworks, such as Spark and Flink.<\/li>\n<\/ul>\n\n\n\n<p>Overall, Apache Beam is a powerful and versatile data processing framework that can be used to process data on a variety of platforms. 
It is a good choice for developers who want to build scalable, reliable, and efficient data processing pipelines.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Basic Tutorials of Apache Beam: Getting Started<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-643.png\" alt=\"\" class=\"wp-image-38803\" style=\"width:696px;height:392px\" width=\"696\" height=\"392\" srcset=\"https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-643.png 838w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-643-300x169.png 300w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-643-768x433.png 768w, https:\/\/www.devopsschool.com\/blog\/wp-content\/uploads\/2023\/08\/image-643-355x199.png 355w\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Basic Tutorials of Apache Beam<\/em><\/strong><\/figcaption><\/figure>\n<\/div>\n\n\n<p>The following steps walk through a basic Apache Beam tutorial:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Create a new Apache Beam project in IntelliJ IDEA<\/strong><\/li>\n<\/ol>\n\n\n\n<p>To create a new Apache Beam project in IntelliJ IDEA, you can follow these steps:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\">  <span class=\"hljs-number\">1.<\/span> Open IntelliJ IDEA.\n  <span class=\"hljs-number\">2.<\/span> Click on the <span class=\"hljs-string\">\"Create New Project\"<\/span> button.\n  <span class=\"hljs-number\">3.<\/span> In the <span class=\"hljs-string\">\"New Project\"<\/span> dialog box, select the <span class=\"hljs-string\">\"Project\"<\/span> project type and click on\n     the <span 
class=\"hljs-string\">\"Next\"<\/span> button.\n  <span class=\"hljs-number\">4.<\/span> In the <span class=\"hljs-string\">\"Choose a project SDK\"<\/span> dialog box, select the <span class=\"hljs-string\">\"Java SDK 1.8\"<\/span> option and\n     click on the <span class=\"hljs-string\">\"Next\"<\/span> button.\n  <span class=\"hljs-number\">5.<\/span> In the <span class=\"hljs-string\">\"Configure Project\"<\/span> dialog box, enter a name <span class=\"hljs-keyword\">for<\/span> your project and click\n     on the <span class=\"hljs-string\">\"Finish\"<\/span> button.<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol class=\"wp-block-list\" start=\"2\">\n<li><strong>Add the Apache Beam SDK to your project<\/strong><\/li>\n<\/ol>\n\n\n\n<p>To add the Apache Beam SDK to your project, you can follow these steps:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-5\" data-shcb-language-name=\"PHP\" data-shcb-language-slug=\"php\"><span><code class=\"hljs language-php\"> <span class=\"hljs-number\">1.<\/span> Open the project in IntelliJ IDEA.\n <span class=\"hljs-number\">2.<\/span> In the project window, right-click on the <span class=\"hljs-string\">\"pom.xml\"<\/span> file <span class=\"hljs-keyword\">and<\/span> select the <span class=\"hljs-string\">\"Open\n    Module Settings\"<\/span> menu item.\n <span class=\"hljs-number\">3.<\/span> Select the <span class=\"hljs-string\">\"Dependencies\"<\/span> tab in the <span class=\"hljs-string\">\"Module Settings\"<\/span> dialog box.\n <span class=\"hljs-number\">4.<\/span> Click on the <span class=\"hljs-string\">\"+\"<\/span> button <span class=\"hljs-keyword\">and<\/span> select the <span 
class=\"hljs-string\">\"Add Library\"<\/span> menu item.\n <span class=\"hljs-number\">5.<\/span> In the <span class=\"hljs-string\">\"Add Library\"<\/span> dialog box, select the <span class=\"hljs-string\">\"Maven\"<\/span> tab.\n <span class=\"hljs-number\">6.<\/span> Enter <span class=\"hljs-string\">\"org.apache.beam\"<\/span> in the <span class=\"hljs-string\">\"Group ID\"<\/span> field.\n <span class=\"hljs-number\">7.<\/span> Enter <span class=\"hljs-string\">\"beam-sdks-java-core\"<\/span> in the <span class=\"hljs-string\">\"Artifact ID\"<\/span> field.\n <span class=\"hljs-number\">8.<\/span> Click on the <span class=\"hljs-string\">\"OK\"<\/span> button.<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-5\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">PHP<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">php<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol class=\"wp-block-list\" start=\"3\">\n<li><strong>Write a simple Apache Beam pipeline<\/strong><\/li>\n<\/ol>\n\n\n\n<p>A simple Apache Beam pipeline can be written in a few lines of code. 
The following is an example of a simple Python pipeline that reads a text file and writes its contents to text files whose names start with <code>output.txt<\/code>:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-6\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\"><span class=\"hljs-keyword\">import<\/span> apache_beam <span class=\"hljs-keyword\">as<\/span> beam\n\n<span class=\"hljs-keyword\">with<\/span> beam.Pipeline() <span class=\"hljs-keyword\">as<\/span> pipeline:\n  (pipeline\n    | <span class=\"hljs-string\">'Read data'<\/span> &gt;&gt; beam.io.ReadFromText(<span class=\"hljs-string\">'data.txt'<\/span>)\n    | <span class=\"hljs-string\">'Write data'<\/span> &gt;&gt; beam.io.WriteToText(<span class=\"hljs-string\">'output.txt'<\/span>))\n\n# The pipeline runs automatically when the with block exits.<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-6\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<ol class=\"wp-block-list\" start=\"4\">\n<li><strong>Run your pipeline<\/strong><\/li>\n<\/ol>\n\n\n\n<p>To run your pipeline, you can use the <code>run()<\/code> method of the <code>Pipeline<\/code> object. 
A pipeline created in a <code>with<\/code> block, as in the previous step, runs automatically when the block exits, so no extra call is needed there. If you create the pipeline without a <code>with<\/code> block, call <code>run()<\/code> explicitly:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-7\" data-shcb-language-name=\"CSS\" data-shcb-language-slug=\"css\"><span><code class=\"hljs language-css\"><span class=\"hljs-selector-tag\">pipeline<\/span><span class=\"hljs-selector-class\">.run<\/span>()<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-7\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">CSS<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">css<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>","protected":false},"excerpt":{"rendered":"<p>What is Apache Beam? Apache Beam is an open-source unified programming model and a set of APIs for building batch and streaming data processing pipelines. It provides a way to define data processing tasks that can run on various distributed processing backends, such as Apache Spark, Apache Flink, Google Cloud Dataflow, and more. 
The goal&#8230;<\/p>\n","protected":false},"author":25,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_joinchat":[],"footnotes":""},"categories":[2],"tags":[],"class_list":["post-38787","post","type-post","status-publish","format-standard","hentry","category-uncategorised"],"_links":{"self":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/38787","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/users\/25"}],"replies":[{"embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=38787"}],"version-history":[{"count":2,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/38787\/revisions"}],"predecessor-version":[{"id":38805,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/38787\/revisions\/38805"}],"wp:attachment":[{"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=38787"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=38787"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.devopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=38787"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}