Achieve insights from historic location knowledge utilizing Amazon Location Service and AWS analytics companies

March 13, 2024

16

Many organizations all over the world depend on using bodily property, resembling automobiles, to ship a service to their end-customers. By monitoring these property in actual time and storing the outcomes, asset house owners can derive precious insights on how their property are getting used to constantly ship enterprise enhancements and plan for future adjustments. For instance, a supply firm working a fleet of automobiles might have to determine the influence from native coverage adjustments outdoors of their management, such because the introduced enlargement of an Extremely-Low Emission Zone (ULEZ). By combining historic car location knowledge with data from different sources, the corporate can devise empirical approaches for higher decision-making. For instance, the corporate’s procurement workforce can use this data to make selections about which automobiles to prioritize for substitute earlier than coverage adjustments go into impact.

Builders can use the help in Amazon Location Service for publishing machine place updates to Amazon EventBridge to construct a near-real-time knowledge pipeline that shops places of tracked property in Amazon Easy Storage Service (Amazon S3). Moreover, you should utilize AWS Lambda to counterpoint incoming location knowledge with knowledge from different sources, resembling an Amazon DynamoDB desk containing car upkeep particulars. Then an information analyst can use the geospatial querying capabilities of Amazon Athena to realize insights, such because the variety of days their automobiles have operated within the proposed boundaries of an expanded ULEZ. As a result of automobiles that don’t meet ULEZ emissions requirements are subjected to a each day cost to function throughout the zone, you should utilize the situation knowledge, together with upkeep knowledge resembling age of the car, present mileage, and present emissions requirements to estimate the quantity the corporate must spend on each day charges.

This submit exhibits how you should utilize Amazon Location, EventBridge, Lambda, Amazon Knowledge Firehose, and Amazon S3 to construct a location-aware knowledge pipeline, and use this knowledge to drive significant insights utilizing AWS Glue and Athena.

Overview of answer

This can be a absolutely serverless answer for location-based asset administration. The answer consists of the next interfaces:

IoT or cell utility – A cell utility or an Web of Issues (IoT) machine permits the monitoring of an organization car whereas it’s in use and transmits its present location securely to the info ingestion layer in AWS. The ingestion method isn’t in scope of this submit. As a substitute, a Lambda perform in our answer simulates pattern car journeys and instantly updates Amazon Location tracker objects with randomized places.
Knowledge analytics – Enterprise analysts collect operational insights from a number of knowledge sources, together with the situation knowledge collected from the automobiles. Knowledge analysts are on the lookout for solutions to questions resembling, “How lengthy did a given car traditionally spend inside a proposed zone, and the way a lot would the charges have value had the coverage been in place over the previous 12 months?”

The next diagram illustrates the answer structure.
Architecture diagram

The workflow consists of the next key steps:

The monitoring performance of Amazon Location is used to trace the car. Utilizing EventBridge integration, filtered positional updates are revealed to an EventBridge occasion bus. This answer makes use of distance-based filtering to scale back prices and jitter. Distanced-based filtering ignores location updates wherein gadgets have moved lower than 30 meters (98.4 ft).
Amazon Location machine place occasions arrive on the EventBridge default bus with supply: ["aws.geo"] and detail-type: ["Location Device Position Event"]. One rule is created to ahead these occasions to 2 downstream targets: a Lambda perform, and a Firehose supply stream.
Two totally different patterns, based mostly on every goal, are described on this submit to display totally different approaches to committing the info to a S3 bucket:
1. Lambda perform – The primary method makes use of a Lambda perform to display how you should utilize code within the knowledge pipeline to instantly rework the incoming location knowledge. You may modify the Lambda perform to fetch extra car data from a separate knowledge retailer (for instance, a DynamoDB desk or a Buyer Relationship Administration system) to counterpoint the info, earlier than storing the ends in an S3 bucket. On this mannequin, the Lambda perform is invoked for every incoming occasion.
2. Firehose supply stream – The second method makes use of a Firehose supply stream to buffer and batch the incoming positional updates, earlier than storing them in an S3 bucket with out modification. This methodology makes use of GZIP compression to optimize storage consumption and question efficiency. It’s also possible to use the knowledge transformation characteristic of Knowledge Firehose to invoke a Lambda perform to carry out knowledge transformation in batches.
AWS Glue crawls each S3 bucket paths, populates the AWS Glue database tables based mostly on the inferred schemas, and makes the info accessible to different analytics functions by the AWS Glue Knowledge Catalog.
Athena is used to run geospatial queries on the situation knowledge saved within the S3 buckets. The Knowledge Catalog supplies metadata that permits analytics functions utilizing Athena to seek out, learn, and course of the situation knowledge saved in Amazon S3.
This answer features a Lambda perform that constantly updates the Amazon Location tracker with simulated location knowledge from fictitious journeys. The Lambda perform is triggered at common intervals utilizing a scheduled EventBridge rule.

You may check this answer your self utilizing the AWS Samples GitHub repository. The repository incorporates the AWS Serverless Utility Mannequin (AWS SAM) template and Lambda code required to check out this answer. Discuss with the directions within the README file for steps on the way to provision and decommission this answer.

Visible layouts in some screenshots on this submit could look totally different than these in your AWS Administration Console.

Knowledge technology

On this part, we talk about the steps to manually or mechanically generate journey knowledge.

Manually generate journey knowledge

You may manually replace machine positions utilizing the AWS Command Line Interface (AWS CLI) command aws location batch-update-device-position. Change the tracker-name, device-id, Place, and SampleTime values with your personal, and be sure that successive updates are greater than 30 meters in distance aside to put an occasion on the default EventBridge occasion bus:

aws location batch-update-device-position --tracker-name <tracker-name> --updates "[{"DeviceId": "<device-id>", "Position": [<longitude>, <latitude>], "SampleTime": "<YYYY-MM-DDThh:mm:ssZ>"}]"

Mechanically generate journey knowledge utilizing the simulator

The offered AWS CloudFormation template deploys an EventBridge scheduled rule and an accompanying Lambda perform that simulates tracker updates from automobiles. This rule is enabled by default, and runs at a frequency specified by the SimulationIntervalMinutes CloudFormation parameter. The information technology Lambda perform updates the Amazon Location tracker with a randomized place offset from the automobiles’ base places.

Automobile names and base places are saved within the automobiles.json file. A car’s beginning place is reset every day, and base places have been chosen to offer them the power to float out and in of the ULEZ on a given day to supply a practical journey simulation.

You may disable the rule quickly by navigating to the scheduled rule particulars on the EventBridge console. Alternatively, change the parameter State: ENABLED to State: DISABLED for the scheduled rule useful resource GenerateDevicePositionsScheduleRule within the template.yml file. Rebuild and re-deploy the AWS SAM template for this variation to take impact.

Location knowledge pipeline approaches

The configurations outlined on this part are deployed mechanically by the offered AWS SAM template. The data on this part is offered to explain the pertinent elements of the answer.

Amazon Location machine place occasions

Amazon Location sends machine place replace occasions to EventBridge within the following format:

{
    "model":"0",
    "id":"<event-id>",
    "detail-type":"Location Machine Place Occasion",
    "supply":"aws.geo",
    "account":"<account-number>",
    "time":"<YYYY-MM-DDThh:mm:ssZ>",
    "area":"<area>",
    "sources":[
        "arn:aws:geo:<region>:<account-number>:tracker/<tracker-name>"
    ],
    "element":{
        "EventType":"UPDATE",
        "TrackerName":"<tracker-name>",
        "DeviceId":"<device-id>",
        "SampleTime":"<YYYY-MM-DDThh:mm:ssZ>",
        "ReceivedTime":"<YYYY-MM-DDThh:mm:ss.sssZ>",
        "Place":[
            <longitude>, 
            <latitude>
	]
    }
}

You may optionally specify an enter transformation to change the format and contents of the machine place occasion knowledge earlier than it reaches the goal.

Knowledge enrichment utilizing Lambda

Knowledge enrichment on this sample is facilitated by the invocation of a Lambda perform. On this instance, we name this perform ProcessDevicePosition, and use a Python runtime. A customized transformation is utilized within the EventBridge goal definition to obtain the occasion knowledge within the following format:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Place":[<Longitude>,<Latitude>]
}

You possibly can apply extra transformations, such because the refactoring of Latitude and Longitude knowledge into separate key-value pairs if that is required by the downstream enterprise logic processing the occasions.

The next code demonstrates the Python utility logic that’s run by the ProcessDevicePosition Lambda perform. Error dealing with has been skipped on this code snippet for brevity. The complete code is accessible within the GitHub repo.

import json
import os
import uuid
import boto3

# Import atmosphere variables from Lambda perform.
bucket_name = os.environ["S3_BUCKET_NAME"]
bucket_prefix = os.environ["S3_BUCKET_LAMBDA_PREFIX"]

s3 = boto3.consumer("s3")

def lambda_handler(occasion, context):
    key = "%s/%s/%s-%s.json" % (bucket_prefix,
                                occasion["DeviceId"],
                                occasion["SampleTime"],
                                str(uuid.uuid4())
    physique = json.dumps(occasion, separators=(",", ":"))
    body_encoded = physique.encode("utf-8")
    s3.put_object(Bucket=bucket_name, Key=key, Physique=body_encoded)
    return {
        "statusCode": 200,
        "physique": "success"
    }

The previous code creates an S3 object for every machine place occasion obtained by EventBridge. The code makes use of the DeviceId as a prefix to put in writing the objects to the bucket.

You may add extra logic to the previous Lambda perform code to counterpoint the occasion knowledge utilizing different sources. The instance within the GitHub repo demonstrates enriching the occasion with knowledge from a DynamoDB car upkeep desk.

Along with the prerequisite AWS Id and Entry Administration (IAM) permissions offered by the position AWSBasicLambdaExecutionRole, the ProcessDevicePosition perform requires permissions to carry out the S3 put_object motion and every other actions required by the info enrichment logic. IAM permissions required by the answer are documented within the template.yml file.

{
    "Model":"2012-10-17",
    "Assertion":[
        {
            "Action":[
                "s3:ListBucket"
            ],
            "Useful resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>"
            ],
            "Impact":"Permit"
        },
        {
            "Motion":[
                "s3:PutObject"
            ],
            "Useful resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>/<S3_BUCKET_LAMBDA_PREFIX>/*"
            ],
            "Impact":"Permit"
        }
    ]
}

Knowledge pipeline utilizing Amazon Knowledge Firehose

Full the next steps to create your Firehose supply stream:

On the Amazon Knowledge Firehose console, select Firehose streams within the navigation pane.
Select Create Firehose stream.
For Supply, select as Direct PUT.
For Vacation spot, select Amazon S3.
For Firehose stream identify, enter a reputation (for this submit, ProcessDevicePositionFirehose).
Configure the vacation spot settings with particulars in regards to the S3 bucket wherein the situation knowledge is saved, together with the partitioning technique:
1. Use <S3_BUCKET_NAME> and <S3_BUCKET_FIREHOSE_PREFIX> to find out the bucket and object prefixes.
2. Use DeviceId as a further prefix to put in writing the objects to the bucket.
Allow Dynamic partitioning and New line delimiter to ensure partitioning is computerized based mostly on DeviceId, and that new line delimiters are added between data in objects which might be delivered to Amazon S3.

These are required by AWS Glue to later crawl the info, and for Athena to acknowledge particular person data.
Destination settings for Firehose stream

Create an EventBridge rule and fix targets

The EventBridge rule ProcessDevicePosition defines two targets: the ProcessDevicePosition Lambda perform, and the ProcessDevicePositionFirehose supply stream. Full the next steps to create the rule and fix targets:

On the EventBridge console, create a brand new rule.
For Title, enter a reputation (for this submit, ProcessDevicePosition).
For Occasion bus¸ select default.
For Rule sort¸ choose Rule with an occasion sample.
For Occasion supply, choose AWS occasions or EventBridge accomplice occasions.
For Technique, choose Use sample type.
Within the Occasion sample part, specify AWS companies because the supply, Amazon Location Service as the precise service, and Location Machine Place Occasion because the occasion sort.
For Goal 1, connect the ProcessDevicePosition Lambda perform as a goal.
We use Enter transformer to customise the occasion that’s dedicated to the S3 bucket.

Configure Enter paths map and Enter template to arrange the payload into the specified format.

The next code is the enter paths map:

{
    EventType: $.element.EventType
    TrackerName: $.element.TrackerName
    DeviceId: $.element.DeviceId
    SampleTime: $.element.SampleTime
    ReceivedTime: $.element.ReceivedTime
    Longitude: $.element.Place[0]
    Latitude: $.element.Place[1]
}

The next code is the enter template:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Place":[<Longitude>, <Latitude>]
}

For Goal 2, select the ProcessDevicePositionFirehose supply stream as a goal.

This goal requires an IAM position that permits one or a number of data to be written to the Firehose supply stream:

{
    "Model": "2012-10-17",
    "Assertion": [
        {
            "Action": [
                "firehose:PutRecord",
                "firehose:PutRecords"
            ],
            "Useful resource": [
                "arn:aws:firehose:<region>:<account-id>:deliverystream/<delivery-stream-name>"
            ],
            "Impact": "Permit"
        }
    ]
}

Crawl and catalog the info utilizing AWS Glue

After ample knowledge has been generated, full the next steps:

On the AWS Glue console, select Crawlers within the navigation pane.
Choose the crawlers which have been created, location-analytics-glue-crawler-lambda and location-analytics-glue-crawler-firehose.
Select Run.

The crawlers will mechanically classify the info into JSON format, group the data into tables and partitions, and commit related metadata to the AWS Glue Knowledge Catalog.

When the Final run statuses of each crawlers present as Succeeded, affirm that two tables (lambda and firehose) have been created on the Tables web page.

The answer partitions the incoming location knowledge based mostly on the deviceid subject. Subsequently, so long as there are not any new gadgets or schema adjustments, the crawlers don’t have to run once more. Nevertheless, if new gadgets are added, or a special subject is used for partitioning, the crawlers have to run once more.
Tables

You’re now prepared to question the tables utilizing Athena.

Question the info utilizing Athena

Athena is a serverless, interactive analytics service constructed to investigate unstructured, semi-structured, and structured knowledge the place it’s hosted. If that is your first time utilizing the Athena console, comply with the directions to arrange a question end result location in Amazon S3. To question the info with Athena, full the next steps:

On the Athena console, open the question editor.
For Knowledge supply, select AwsDataCatalog.
For Database, select location-analytics-glue-database.
On the choices menu (three vertical dots), select Preview Desk to question the content material of each tables.

The question shows 10 pattern positional data at present saved within the desk. The next screenshot is an instance from previewing the firehose desk. The firehose desk shops uncooked, unmodified knowledge from the Amazon Location tracker.
Query results
Now you can experiment with geospatial queries.The GeoJSON file for the 2021 London ULEZ enlargement is a part of the repository, and has already been transformed into a question suitable with each Athena tables.

Copy and paste the content material from the 1-firehose-athena-ulez-2021-create-view.sql file discovered within the examples/firehose folder into the question editor.

This question makes use of the ST_Within geospatial perform to find out if a recorded place is inside or outdoors the ULEZ zone outlined by the polygon. A brand new view referred to as ulezvehicleanalysis_firehose is created with a brand new column, insidezone, which captures whether or not the recorded place exists throughout the zone.

A easy Python utility is offered, which converts the polygon options discovered within the downloaded GeoJSON file into ST_Polygon strings based mostly on the well-known textual content format that can be utilized instantly in an Athena question.

Select Preview View on the ulezvehicleanalysis_firehose view to discover its content material.

Now you can run queries towards this view to realize overarching insights.

Copy and paste the content material from the 2-firehose-athena-ulez-2021-query-days-in-zone.sql file discovered within the examples/firehose folder into the question editor.

This question establishes the overall variety of days every car has entered ULEZ, and what the anticipated complete fees could be. The question has been parameterized utilizing the ? placeholder character. Parameterized queries can help you rerun the identical question with totally different parameter values.

Enter the each day price quantity for Parameter 1, then run the question.

The outcomes show every car, the overall variety of days spent within the proposed ULEZ, and the overall fees based mostly on the each day price you entered.
Query results
You may repeat this train utilizing the lambda desk. Knowledge within the lambda desk is augmented with extra car particulars current within the car upkeep DynamoDB desk on the time it’s processed by the Lambda perform. The answer helps the next fields:

MeetsEmissionStandards (Boolean)
Mileage (Quantity)
PurchaseDate (String, in YYYY-MM-DD format)

It’s also possible to enrich the brand new knowledge because it arrives.

On the DynamoDB console, discover the car upkeep desk beneath Tables. The desk identify is offered as output VehicleMaintenanceDynamoTable within the deployed CloudFormation stack.
Select Discover desk gadgets to view the content material of the desk.
Select Create merchandise to create a brand new document for a car.
Enter DeviceId (resembling vehicle1 as a String), PurchaseDate (resembling 2005-10-01 as a String), Mileage (resembling 10000 as a Quantity), and MeetsEmissionStandards (with a price resembling False as Boolean).
Select Create merchandise to create the document.
Duplicate the newly created document with extra entries for different automobiles (resembling for vehicle2 or vehicle3), modifying the values of the attributes barely every time.
Rerun the location-analytics-glue-crawler-lambda AWS Glue crawler after new knowledge has been generated to verify that the replace to the schema with new fields is registered.
Copy and paste the content material from the 1-lambda-athena-ulez-2021-create-view.sql file discovered within the examples/lambda folder into the question editor.
Preview the ulezvehicleanalysis_lambda view to verify that the brand new columns have been created.

If errors resembling Column 'mileage' can't be resolved are displayed, the info enrichment isn’t going down, or the AWS Glue crawler has not but detected updates to the schema.

If the Preview desk possibility is simply returning outcomes from earlier than you created data within the DynamoDB desk, return the question ends in descending order utilizing sampletime (for instance, order by sampletime desc restrict 100;).
Query results
Now we give attention to the automobiles that don’t at present meet emissions requirements, and order the automobiles in descending order based mostly on the mileage per 12 months (calculated utilizing the newest mileage / age of auto in years).

Copy and paste the content material from the 2-lambda-athena-ulez-2021-query-days-in-zone.sql file discovered within the examples/lambda folder into the question editor.

On this instance, we are able to see that out of our fleet of automobiles, 5 have been reported as not assembly emission requirements. We will additionally see the automobiles which have collected excessive mileage per 12 months, and the variety of days spent within the proposed ULEZ. The fleet operator could now resolve to prioritize these automobiles for substitute. As a result of location knowledge is enriched with probably the most up-to-date car upkeep knowledge on the time it’s ingested, you may additional evolve these queries to run over an outlined time window. For instance, you would consider mileage adjustments throughout the previous 12 months.

Because of the dynamic nature of the info enrichment, any new knowledge being dedicated to Amazon S3, together with the question outcomes, will probably be altered as and when data are up to date within the DynamoDB car upkeep desk.

Clear up

Discuss with the directions within the README file to wash up the sources provisioned for this answer.

Conclusion

This submit demonstrated how you should utilize Amazon Location, EventBridge, Lambda, Amazon Knowledge Firehose, and Amazon S3 to construct a location-aware knowledge pipeline, and use the collected machine place knowledge to drive analytical insights utilizing AWS Glue and Athena. By monitoring these property in actual time and storing the outcomes, firms can derive precious insights on how successfully their fleets are being utilized and higher react to adjustments sooner or later. Now you can discover extending this pattern code with your personal machine monitoring knowledge and analytics necessities.

Concerning the Authors

Alan Peaty is a Senior Accomplice Options Architect at AWS. Alan helps World Methods Integrators (GSIs) and World Impartial Software program Distributors (GISVs) remedy advanced buyer challenges utilizing AWS companies. Previous to becoming a member of AWS, Alan labored as an architect at techniques integrators to translate enterprise necessities into technical options. Outdoors of labor, Alan is an IoT fanatic and a eager runner who likes to hit the muddy trails of the English countryside.

Parag Srivastava is a Options Architect at AWS, serving to enterprise clients with profitable cloud adoption and migration. Throughout his skilled profession, he has been extensively concerned in advanced digital transformation initiatives. He’s additionally enthusiastic about constructing progressive options round geospatial facets of addresses.