Use Amazon OpenSearch Ingestion to migrate to Amazon OpenSearch Serverless

February 27, 2024

Amazon OpenSearch Serverless is an on-demand auto scaling configuration for Amazon OpenSearch Service. Since its launch, interest in OpenSearch Serverless has been steadily growing. Customers prefer to let the service manage its capacity automatically rather than having to provision capacity manually. Until now, customers have had to rely on custom code or third-party solutions to move data between provisioned OpenSearch Service domains and OpenSearch Serverless.

We recently introduced a feature in Amazon OpenSearch Ingestion (OSI) to make this migration much easier. OSI is a fully managed, serverless data collector that delivers real-time log, metric, and trace data to OpenSearch Service domains and OpenSearch Serverless collections.

In this post, we outline the steps to migrate data between provisioned OpenSearch Service domains and OpenSearch Serverless. Migration of metadata such as security roles and dashboard objects will be covered in a subsequent post.

Solution overview

The following diagram shows the components needed to move data between OpenSearch Service provisioned domains and OpenSearch Serverless using OSI. You use OSI with OpenSearch Service as the source and an OpenSearch Serverless collection as the sink.

Prerequisites

Before getting started, complete the following steps to create the necessary resources:

Create an AWS Identity and Access Management (IAM) role that the OpenSearch Ingestion pipeline will assume to write to the OpenSearch Serverless collection. This role needs to be specified in the sts_role_arn parameter of the pipeline configuration.

Attach a permissions policy to the role to allow it to read data from the OpenSearch Service domain. The following is a sample policy with least privileges:

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":"es:ESHttpGet",
         "Resource":[
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/_cat/indices",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/_search",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/_search/scroll",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/*/_search"
         ]
      },
      {
         "Effect":"Allow",
         "Action":"es:ESHttpPost",
         "Resource":[
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/*/_search/point_in_time",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/*/_search/scroll"
         ]
      },
      {
         "Effect":"Allow",
         "Action":"es:ESHttpDelete",
         "Resource":[
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/_search/point_in_time",
            "arn:aws:es:us-east-1:{account-id}:domain/{domain-name}/_search/scroll"
         ]
      }
   ]
}
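
The sample read policy hard-codes us-east-1 and leaves the account ID and domain name as placeholders. As a minimal sketch, the same document could be generated for any Region, account, and domain with a small Python helper. The function name source_read_policy and the values in the usage line are hypothetical and only for illustration; the actions and resource paths are copied from the sample policy.

```python
import json


def source_read_policy(region: str, account_id: str, domain_name: str) -> dict:
    """Build the sample least-privilege read policy for one OpenSearch Service domain."""
    base = f"arn:aws:es:{region}:{account_id}:domain/{domain_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only HTTP GET calls the pipeline makes against the source domain.
                "Effect": "Allow",
                "Action": "es:ESHttpGet",
                "Resource": [
                    f"{base}/",
                    f"{base}/_cat/indices",
                    f"{base}/_search",
                    f"{base}/_search/scroll",
                    f"{base}/*/_search",
                ],
            },
            {
                # Opening point-in-time and scroll contexts requires POST.
                "Effect": "Allow",
                "Action": "es:ESHttpPost",
                "Resource": [
                    f"{base}/*/_search/point_in_time",
                    f"{base}/*/_search/scroll",
                ],
            },
            {
                # Cleaning up point-in-time and scroll contexts requires DELETE.
                "Effect": "Allow",
                "Action": "es:ESHttpDelete",
                "Resource": [
                    f"{base}/_search/point_in_time",
                    f"{base}/_search/scroll",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical Region, account ID, and domain name.
    print(json.dumps(source_read_policy("us-west-2", "111122223333", "octank-domain"), indent=2))
```

Building the ARNs from one base string keeps the Region consistent across all three statements, which is easy to get wrong when editing the JSON by hand.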

Attach a permissions policy to the role to allow it to send data to the collection. The following is a sample policy with least privileges:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "aoss:BatchGetCollection",
        "aoss:APIAccessAll"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:aoss:{region}:{your-account-id}:collection/{collection-id}"
    },
    {
      "Action": [
        "aoss:CreateSecurityPolicy",
        "aoss:GetSecurityPolicy",
        "aoss:UpdateSecurityPolicy"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aoss:collection": "{collection-name}"
        }
      }
    }
  ]
}

Configure the role's trust relationship so that OpenSearch Ingestion can assume the role, as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "osis-pipelines.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

We recommend adding the aws:SourceAccount and aws:SourceArn condition keys to the policy for protection against the confused deputy problem:

"Condition": {
    "StringEquals": {
        "aws:SourceAccount": "{your-account-id}"
    },
    "ArnLike": {
        "aws:SourceArn": "arn:aws:osis:{region}:{your-account-id}:pipeline/*"
    }
}
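
The condition block is shown as a fragment, so it may help to see where it sits in the full trust policy: inside the statement, alongside Effect, Principal, and Action. The following Python sketch assembles the complete document. The function name pipeline_trust_policy and the values in the usage line are hypothetical.

```python
import json


def pipeline_trust_policy(region: str, account_id: str) -> dict:
    """Trust policy for the pipeline role, including confused-deputy conditions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "osis-pipelines.amazonaws.com"},
                "Action": "sts:AssumeRole",
                # The Condition element belongs to the statement, not to the policy root.
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:osis:{region}:{account_id}:pipeline/*"
                    },
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical Region and account ID.
    print(json.dumps(pipeline_trust_policy("us-west-2", "111122223333"), indent=2))
```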

Map the ARN of the OpenSearch Ingestion role as a backend role on the OpenSearch Service domain (in this walkthrough, mapped to all_access). This is a simplified example that uses the all_access role. For production scenarios, make sure to use a role with just enough permissions to read and write.

Create an OpenSearch Serverless collection, which is where the data will be ingested.

Associate a data access policy, as shown in the following code, to grant the OpenSearch Ingestion role permissions on the collection:

[
  {
    "Rules": [
      {
        "Resource": [
          "index/collection-name/*"
        ],
        "Permission": [
          "aoss:CreateIndex",
          "aoss:UpdateIndex",
          "aoss:DescribeIndex",
          "aoss:WriteDocument"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
      "arn:aws:iam::{account-id}:role/pipeline-role"
    ],
    "Description": "Pipeline role access"
  }
]
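
A data access policy has to be valid JSON, and a stray trailing comma inside an array is enough to make it unparseable. One way to avoid hand-editing mistakes is to generate the document with json.dumps. The following is a minimal sketch; the function name pipeline_data_access_policy and the collection name and role ARN in the usage line are hypothetical, while the permissions are the ones listed in the policy above.

```python
import json


def pipeline_data_access_policy(collection_name: str, role_arn: str) -> str:
    """Return the data access policy as a JSON string for one collection and role."""
    policy = [
        {
            "Rules": [
                {
                    # Grants index-level permissions on every index in the collection.
                    "Resource": [f"index/{collection_name}/*"],
                    "Permission": [
                        "aoss:CreateIndex",
                        "aoss:UpdateIndex",
                        "aoss:DescribeIndex",
                        "aoss:WriteDocument",
                    ],
                    "ResourceType": "index",
                }
            ],
            "Principal": [role_arn],
            "Description": "Pipeline role access",
        }
    ]
    # json.dumps always emits syntactically valid JSON (no trailing commas).
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Hypothetical collection name and role ARN.
    print(pipeline_data_access_policy(
        "octank-collection", "arn:aws:iam::111122223333:role/pipeline-role"
    ))
```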

If the collection is defined as a VPC collection, you need to create a network policy and configure it in the ingestion pipeline.

Now you're ready to move data from your provisioned domain to OpenSearch Serverless.

Move data from provisioned domains to Serverless

Set up Amazon OpenSearch Ingestion
To get started, you must have an active OpenSearch Service domain (source) and an OpenSearch Serverless collection (sink). Complete the following steps to set up an OpenSearch Ingestion pipeline for migration:

On the OpenSearch Service console, choose Pipeline under Ingestion in the navigation pane.
Choose Create a pipeline.
For Pipeline name, enter a name (for example, octank-migration).
For Pipeline capacity, you can define the minimum and maximum capacity to scale up the resources. For now, you can leave the defaults of 1 for the minimum and 4 for the maximum.
For Configuration Blueprint, select AWS-OpenSearchDataMigrationPipeline.
Update the following information for the source:
1. Uncomment hosts and specify the endpoint of the existing OpenSearch Service domain.
2. Uncomment distribution_version if your source cluster is an OpenSearch Service cluster with compatibility mode enabled; otherwise, leave it commented.
3. Uncomment indices, include, and index_name_regex, and add an index name or pattern that you want to migrate (for example, octank-iot-logs-2023.11.0*).
4. Update region under aws to the Region where your source domain is located (for example, us-west-2).
5. Update sts_role_arn under aws to the role that has permission to read data from the OpenSearch Service domain (for example, arn:aws:iam::111122223333:role/osis-pipeline). This role should be added as a backend role within the OpenSearch Service security roles.
Update the following information for the sink:
1. Uncomment hosts and specify the endpoint of the existing OpenSearch Serverless collection.
2. Update sts_role_arn under aws to the role that has permission to write data into the OpenSearch Serverless collection (for example, arn:aws:iam::111122223333:role/osis-pipeline). This role should be added as part of the data access policy for the OpenSearch Serverless collection.
3. Update the serverless flag to true.
4. For index, you can leave the default, which gets the index name from the source metadata and writes to an index of the same name in the destination. Alternatively, if you want a different index name in the destination, change this value to your desired name.
5. For document_id, you can get the ID from the document metadata in the source and use the same ID in the target. Note that custom document IDs are supported only for collections of type SEARCH; if your collection is of type TIMESERIES or VECTORSEARCH, you should comment out this line.
Next, you can validate your pipeline to check the connectivity of the source and sink and make sure the endpoints exist and are accessible.
For Network settings, choose your preferred setting:
1. Choose VPC access and select your VPC, subnet, and security group to set up access privately.
2. Choose Public to use public access. AWS recommends that you use a VPC endpoint for all production workloads, but for this walkthrough, select Public.
For Log Publishing Option, you can either create a new Amazon CloudWatch log group or use an existing CloudWatch log group to write the ingestion logs. This provides access to information about errors and warnings raised during the operation, which can help during troubleshooting. For this walkthrough, choose Create new group.
Choose Next, and verify the details you specified for your pipeline settings.
Choose Create pipeline.
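
The source and sink settings edited in the preceding steps end up in a single YAML pipeline configuration. The following Python sketch renders what such a configuration might look like once the blueprint is filled in. It is not a copy of the AWS-OpenSearchDataMigrationPipeline blueprint: the pipeline name, the exact key layout, the getMetadata expressions for index and document_id, and the use of one Region for both source and sink are assumptions that should be checked against the blueprint text shown in the console. The function name render_pipeline and all endpoint values are hypothetical, and distribution_version is left out.

```python
def render_pipeline(
    source_endpoint: str,
    sink_endpoint: str,
    index_pattern: str,
    region: str,
    role_arn: str,
    collection_type: str = "SEARCH",
) -> str:
    """Render an illustrative migration pipeline configuration as YAML text."""
    lines = [
        'version: "2"',
        "opensearch-migration-pipeline:",  # assumed pipeline name
        "  source:",
        "    opensearch:",
        f'      hosts: ["{source_endpoint}"]',
        "      indices:",
        "        include:",
        f'          - index_name_regex: "{index_pattern}"',
        "      aws:",
        f'        region: "{region}"',
        f'        sts_role_arn: "{role_arn}"',
        "  sink:",
        "    - opensearch:",
        f'        hosts: ["{sink_endpoint}"]',
        "        aws:",
        f'          region: "{region}"',
        f'          sts_role_arn: "{role_arn}"',
        "          serverless: true",
        # Assumed expression: reuse the source index name in the destination.
        '        index: "${getMetadata(\\"opensearch-index\\")}"',
    ]
    # Custom document IDs are supported only for SEARCH collections, so the
    # document_id line is omitted for TIMESERIES and VECTORSEARCH.
    if collection_type == "SEARCH":
        lines.append(
            '        document_id: "${getMetadata(\\"opensearch-document_id\\")}"'
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical endpoints; the pattern and role ARN follow the examples in the steps.
    print(render_pipeline(
        "https://search-octank.us-west-2.es.amazonaws.com",
        "https://abc123.us-west-2.aoss.amazonaws.com",
        "octank-iot-logs-2023.11.0*",
        "us-west-2",
        "arn:aws:iam::111122223333:role/osis-pipeline",
    ))
```

Passing collection_type="TIMESERIES" or "VECTORSEARCH" drops the document_id line, which mirrors the instruction to comment it out for those collection types.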

It should take a few minutes to create the ingestion pipeline.
The following graphic gives a quick demonstration of creating the OpenSearch Ingestion pipeline using the preceding steps.

Verify ingested data in the target OpenSearch Serverless collection

After the pipeline is created and active, log in to OpenSearch Dashboards for your OpenSearch Serverless collection and run the following command to list the indexes:

GET _cat/indices?v
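
Beyond checking that the indexes exist, you may want to compare document counts between the source and the target. The following Python sketch parses the text returned by GET _cat/indices?v and reports indexes whose counts differ. It assumes the output contains index and docs.count columns and that no cell is empty; the function names and the sample outputs are hypothetical.

```python
def parse_cat_indices(text: str) -> dict[str, int]:
    """Map index name to docs.count from the text output of GET _cat/indices?v."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    name_col = header.index("index")
    count_col = header.index("docs.count")
    counts = {}
    for row in body:
        # Whitespace splitting only works when every cell has a value.
        if len(row) != len(header):
            raise ValueError(f"cannot align columns for row: {row}")
        counts[row[name_col]] = int(row[count_col])
    return counts


def count_mismatches(source: dict[str, int], target: dict[str, int]) -> dict:
    """Return {index: (source_count, target_count)} for indexes that differ."""
    return {
        name: (count, target.get(name))
        for name, count in source.items()
        if target.get(name) != count
    }


if __name__ == "__main__":
    # Hypothetical outputs pasted from the source domain and the target collection.
    source_out = """
health status index                      uuid pri rep docs.count docs.deleted store.size pri.store.size
green  open   octank-iot-logs-2023.11.01 a1b2   5   1       1200            0      1.1mb        550kb
green  open   octank-iot-logs-2023.11.02 c3d4   5   1        800            0      700kb        350kb
"""
    target_out = """
health status index                      uuid pri rep docs.count docs.deleted store.size pri.store.size
green  open   octank-iot-logs-2023.11.01 e5f6   2   0       1200            0      1.0mb        1.0mb
"""
    print(count_mismatches(parse_cat_indices(source_out), parse_cat_indices(target_out)))
```

Counts on the target can lag behind the source while the pipeline is still running, so a mismatch right after the pipeline becomes active is not necessarily an error.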

The following graphic gives a quick demonstration of listing the indexes before and after the pipeline becomes active.

Conclusion

In this post, we saw how OpenSearch Ingestion can ingest data into an OpenSearch Serverless collection without the need for third-party solutions. With minimal data producer configuration, it automatically ingested data into the collection. OSI also allows you to transform or reindex data from ES 7.x versions before ingestion into an OpenSearch Service domain or OpenSearch Serverless collection. OSI eliminates the need to provision, scale, or manage servers. AWS offers various resources for you to quickly start building pipelines using OpenSearch Ingestion. You can use various built-in pipeline integrations to quickly ingest data from Amazon DynamoDB, Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Security Lake, Fluent Bit, and many more. The OpenSearch Ingestion blueprints enable you to build data pipelines with minimal configuration changes.

About the Authors

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

Rahul Sharma is a Technical Account Manager at Amazon Web Services. He is passionate about the data technologies that help leverage data as a strategic asset and is based out of New York City, New York.

