About williamrobin

Oct.02

Olympusat Case Study on my implementation of a serverless workflow with AWS Lambda

About Olympusat

Olympusat is a large, independent media company specializing in ownership, corporate distribution, production, and technical services. The company has established itself as a leader in the Hispanic television and media space with more than 100 English-language and Spanish-language television networks. The company provides an over-the-top content-delivery offering called VEMOX, which enables Olympusat to distribute live channels and extensive on-demand video to smart televisions and other smart devices via the cloud. Olympusat was founded in 1999 and engages more than eight million viewers each day.

The Challenge

  • Olympusat was running a traditional PHP application, but wanted to use one language to run its backend.
  • The company’s server-based architecture required a server administrator to handle upgrades and launch new instances.
  • The company wanted to move to a microservices architecture supporting both a layer of primitive functions that perform common tasks and a layer of application services such as video-format translation.

The Solution

  • Olympusat realized it could develop application services more quickly by switching from a PHP application to a Node.js implementation using Amazon API Gateway and AWS Lambda.
  • The company built a microservices architecture by creating a low-level set of primitive microservices running in an asynchronous workflow, which lets it add new delivery vehicles and data sources without adding code. Amazon API Gateway fronts an asynchronous workflow layer implemented with AWS Lambda, which validates incoming requests against a schema stored in Amazon Simple Storage Service (Amazon S3).
  • Once validated, the message is placed in an Amazon Simple Queue Service (Amazon SQS) queue and recorded in Amazon DynamoDB.
  • The workflow function uses Amazon Simple Notification Service (Amazon SNS) to communicate with the microservices, signaling them to either call more services or return results. A minimal sketch of this flow follows the list.
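
For illustration only, here is a minimal sketch of that request path, not Olympusat's actual code: an API Gateway-fronted Lambda handler (Node.js, shown in TypeScript with the AWS SDK for JavaScript v3) validates the body against a JSON schema fetched from S3, queues the validated message in SQS, and records it in DynamoDB. The bucket, queue, and table names, and the use of Ajv for validation, are assumptions.

```typescript
import { randomUUID } from "node:crypto";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import Ajv from "ajv"; // stand-in validator; the case study doesn't name one

const s3 = new S3Client({});
const sqs = new SQSClient({});
const ddb = new DynamoDBClient({});

export async function handler(event: { body: string }) {
  // 1. Validate the incoming request against a schema stored in S3.
  const obj = await s3.send(new GetObjectCommand({
    Bucket: "workflow-schemas",        // hypothetical bucket
    Key: "request-schema.json",
  }));
  const schema = JSON.parse(await obj.Body!.transformToString());
  const message = JSON.parse(event.body);
  if (!new Ajv().validate(schema, message)) {
    return { statusCode: 400, body: "request does not match schema" };
  }

  // 2. Queue the validated message and record it in DynamoDB.
  const id = randomUUID();
  await sqs.send(new SendMessageCommand({
    QueueUrl: process.env.QUEUE_URL!,  // assumed environment variable
    MessageBody: JSON.stringify(message),
  }));
  await ddb.send(new PutItemCommand({
    TableName: "workflow-requests",    // hypothetical table
    Item: { id: { S: id }, payload: { S: JSON.stringify(message) } },
  }));

  // The asynchronous workflow layer would then use SNS to signal the
  // downstream microservices to call more services or return results.
  return { statusCode: 202, body: JSON.stringify({ id }) };
}
```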

The Benefits

  • Olympusat saves about $25,000 monthly by eliminating the use of similar, more expensive services.
  • With a single backend language running on AWS Lambda, the company can put code into production without provisioning infrastructure or disrupting other departments.
  • AWS Lambda eliminated the need for server administration, so the team spends its time on innovation and problem-solving rather than managing servers.

Learn More

To learn more about how AWS Lambda can help reduce time to market as well as development and operational costs, visit the AWS Lambda details page.

Uncategorized

Oct.02

Think outside of the Phone

Why Not Mix It Up?

Rather than build an app entirely with native or HTML5 technology, why not mix and match the technologies? With a hybrid application, building a mobile experience that leverages both native and HTML5 code for the user interface is quite possible. This enables the developer to use the most appropriate tool for the job when developing the user interface.

Clearly, developing portions of the user experience in two or more different technologies has some downsides. Chiefly, you would need developers who can handle both native development and HTML5. The portion of the user interface in native code couldn’t be readily used on other platforms and would need to be redeveloped. Given the broader scope of knowledge required by this technique and the difficulties outlined above, why would anyone want to embark on an endeavor that mixes user-interface technologies?

Uncategorized

Oct.02

Ingesting large amounts of data

Data Lake
Data Warehouses
Data Marts
 
The data lake
 
Amazon Simple Storage Service (Amazon S3) is object storage with a simple web service interface to store and retrieve any amount of data from anywhere on the web. It is designed to deliver 99.999999999% durability, and scale past trillions of objects worldwide.
Customers use S3 as primary storage for cloud-native applications; as a bulk repository, or “data lake,” for analytics; as a target for backup & recovery and disaster recovery; and with serverless computing.
– Can run SQL queries against the data
– The input data is stored in its original state
– Ability to attach metadata to objects stored in S3 (letting us query and filter things like heart-rate events for a specific class directly on the S3 store)
This lets us operate a schema-on-read data lake: the data shapes the app, rather than the app constraining the data; we just provide an ecosystem that encourages this process. S3 is even more powerful with the ability to store metadata directly on the objects themselves. This means we can tag sensor events with the member who produced them, and query those results directly from the lake.
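
As a concrete, hypothetical illustration of both points, the sketch below uses the AWS SDK for JavaScript v3 to upload a heart-rate event with member metadata attached to the object, then runs SQL against it with S3 Select (one way S3 supports SQL-style queries). The bucket, key, and event shape are assumptions, not our schema.

```typescript
import {
  S3Client,
  PutObjectCommand,
  SelectObjectContentCommand,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-2" });

async function main() {
  // Store the raw event in its original state, tagged with user-defined
  // metadata naming the member and class that produced it.
  await s3.send(new PutObjectCommand({
    Bucket: "sensor-data-lake",             // hypothetical bucket
    Key: "events/heart-rate/evt-001.json",
    Body: JSON.stringify({ bpm: 142, ts: Date.now() }),
    Metadata: { "member-id": "m-42", "class-id": "c-7" },
  }));

  // Schema-on-read: run SQL directly against the stored object.
  const result = await s3.send(new SelectObjectContentCommand({
    Bucket: "sensor-data-lake",
    Key: "events/heart-rate/evt-001.json",
    Expression: "SELECT s.bpm FROM S3Object s WHERE s.bpm > 120",
    ExpressionType: "SQL",
    InputSerialization: { JSON: { Type: "DOCUMENT" } },
    OutputSerialization: { JSON: {} },
  }));

  // The response streams matching records back as events.
  for await (const event of result.Payload!) {
    if (event.Records?.Payload) {
      process.stdout.write(Buffer.from(event.Records.Payload).toString());
    }
  }
}

main().catch(console.error);
```
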
Price and calculator

Storage Pricing (varies by region)

Region:
US East (Ohio)

                      Standard Storage   Standard – Infrequent Access †   Glacier Storage
First 50 TB / month   $0.023 per GB      $0.0125 per GB                   $0.004 per GB
Next 450 TB / month   $0.022 per GB      $0.0125 per GB                   $0.004 per GB
Over 500 TB / month   $0.021 per GB      $0.0125 per GB                   $0.004 per GB
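
Since the tiers above are graduated (each rate applies only to the gigabytes that fall within its band), a small calculator helps. Here is a sketch using the Standard rates above; rates change over time, so check the current price list.

```typescript
// Graduated monthly cost for S3 Standard storage, US East (Ohio) rates above.
const STANDARD_TIERS = [
  { upToGB: 50 * 1024, perGB: 0.023 },   // first 50 TB
  { upToGB: 500 * 1024, perGB: 0.022 },  // next 450 TB
  { upToGB: Infinity, perGB: 0.021 },    // over 500 TB
];

function monthlyStorageCostUSD(totalGB: number): number {
  let cost = 0;
  let priorCap = 0;
  for (const tier of STANDARD_TIERS) {
    const gbInTier = Math.min(totalGB, tier.upToGB) - priorCap;
    if (gbInTier <= 0) break;
    cost += gbInTier * tier.perGB;
    priorCap = tier.upToGB;
  }
  return cost;
}

// 100 TB for one month: 51,200 GB * $0.023 + 51,200 GB * $0.022 = $2,304.00
console.log(monthlyStorageCostUSD(100 * 1024).toFixed(2)); // "2304.00"
```
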
Request Pricing

Amazon S3 request costs are based on the request type, and are charged on the quantity of requests or the volume of data retrieved as listed in the table below.
Region:
US East (Ohio)

Pricing
For Requests Not Otherwise Specified Below
PUT, COPY, POST, or LIST Requests $0.005 per 1,000 requests
GET and all other Requests $0.004 per 10,000 requests
Delete Requests Free †
For Standard – Infrequent Access Requests
PUT, COPY, or POST Requests $0.01 per 1,000 requests
GET and all other Requests $0.01 per 10,000 requests
Lifecycle Transition Requests into Standard – Infrequent Access $0.01 per 1,000 requests
Data Retrievals $0.01 per GB
For Glacier Requests
Lifecycle Transition Requests into Glacier $0.05 per 1,000 requests
Glacier Retrieval Fees See Glacier Pricing Page
† No charge for delete requests of Standard objects. Objects archived to Glacier have a minimum of 90 days of storage; objects deleted before 90 days incur a pro-rated charge equal to the storage charge for the remaining days. Objects in Standard – Infrequent Access have a minimum of 30 days of storage; objects deleted, overwritten, or transitioned to a different storage class before 30 days incur a pro-rated charge equal to the storage charge for the remaining days.

Data Transfer Pricing

The pricing below is based on data transferred “in” to and “out” of Amazon S3 (over either Direct Connect or the public Internet). Transfers between S3 buckets or from S3 to any service(s) within the same region are free.
Region:
US East (Ohio)

Pricing
Data Transfer OUT From Amazon S3 To
Amazon EC2 in the same region $0.000 per GB
US East (N. Virginia) $0.010 per GB
Another AWS Region $0.020 per GB
Amazon CloudFront $0.000 per GB
Data Transfer OUT From Amazon S3 To Internet
First 1 GB / month $0.000 per GB
Up to 10 TB / month $0.090 per GB
Next 40 TB / month $0.085 per GB
Next 100 TB / month $0.070 per GB
Next 350 TB / month $0.050 per GB
Next 524 TB / month Contact Us
Next 4 PB / month Contact Us
Greater than 5 PB / month Contact Us
Data Transfer IN To Amazon S3
All data transfer in $0.000 per GB
The data store
Amazon Aurora is a MySQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora provides up to five times better performance than MySQL with the security, availability, and reliability of a commercial database at one tenth the cost.
 
The data mart(s)
A data mart is a simple form of data warehouse focused on a specific functional area or subject matter. For example, you can have specific data marts for each division in your organization, or segment data marts based on regions. You can build data marts from a large data warehouse, operational stores, or a hybrid of the two. Data marts are simple to design, build, and administer. However, because data marts are focused on specific functional areas, querying across functional areas can become complex because of this distribution.
Uncategorized, aws

Oct.02

Syncing sensor data from local machines to the (Amazon) Cloud and vice versa

Reach out to Live Edit for the historical data?
  • OTB Warehouse diagram
  • Break down the technology used
  • Use AWS email sender for mail
  • Break down the schema
  • Update Live Edit and the warehouse
  • Look into Live Edit's API
Issues with a duplicate local database
  • Do we trust the client to validate its input?
  • If the client PC fails, the hard drive dies, etc., we have no backup as of now.
  • Performance data can't be matched with user IDs and the like, as this data is local and we can't query it.
  • Attempting to sync everything requires redeveloping the existing application.
Pros
  • Having a local cache for high-demand writes is a good idea.
Diagram of the database relations, including DynamoDB
Process
I am architecting the performance-type event flow to make the least impact on the existing code base. Swapping a local SQL server for a remote one is a fairly trivial task, but maintaining a local copy of each studio's data and syncing it with our database will create many undesirable edge cases. We can build around momentary drops in internet connection, but operating without internet access raises serious concerns.
Thus far we have been discussing the modularization of our business functionality, and the large majority of it will live in the cloud; otherwise we would have to implement the functionality on every single client we build and maintain each one individually.
If our studios expect to provide the best technology to our members, they need to maintain an internet connection.
Performance events will be queued on the client until they are posted to our DataStores.
Future considerations
We might end up developing a single API endpoint that the client application can post to directly, with that API service responsible for populating the consumers (Live Edit, DynamoDB, logging, etc.). This would simplify the client's implementation significantly, as it would only be responsible for posting its queued performance-type objects to a single API.
The backend services would be built in Node.js, Java, or Python (so far the only languages that AWS Lambda supports).
At this time our client (the studio app) will be responsible for posting to both of these APIs.
Ideally the local data store would behave like Amazon SQS, a queue service that works on a first-in, first-out basis. Services consume this data by polling the queue for new events, processing the input, and generating their own output (saving it to DynamoDB, a third-party service like Live Edit, etc.). A sketch of this client-side queue follows.
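
Below is a rough sketch of that idea, assuming a single hypothetical ingest endpoint (per the future consideration above); it is not the studio app's real code. Events drain in order, and a failed post leaves the event at the head of the queue so it is retried after a momentary connection drop.

```typescript
// Hypothetical event shape and ingest endpoint; names are assumptions.
type PerformanceEvent = {
  event_id: string;
  event_type: number;          // 1 = Treadmill, 2 = Rower, 3 = Free Weights, 4 = Bike
  event_created: number;       // epoch millis
  event_data: Array<object>;   // misc sensor payloads, kept small
};

const INGEST_URL = "https://api.example.com/v1/performance-events"; // assumed

const queue: PerformanceEvent[] = [];

function enqueue(evt: PerformanceEvent): void {
  queue.push(evt); // first in, first out
}

// Drain in order; stop at the first failure so the event stays queued
// and is retried once the connection comes back.
async function drain(): Promise<void> {
  while (queue.length > 0) {
    const res = await fetch(INGEST_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(queue[0]),
    }).catch(() => undefined);   // network drop: treat as a failed attempt
    if (!res || !res.ok) break;
    queue.shift();               // confirmed delivered, remove from the queue
  }
}

setInterval(drain, 5_000);       // retry every 5 seconds
```
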
To make the best use of space, we will keep the performance-type object as small as possible. The Performance table will look like this:

Key   Type
1     Treadmill
2     Rower
3     Free Weights
4     Bike
Note that we can create searchable indexes based on top-level attributes; all other miscellaneous data should be stored within the Array<Object> event_data property.
Roughly 69,120,000 requests will be created in a single day across 800 studios each running 4 classes (21,600 events per class), an average of about 800 writes per second over a 24-hour day (69,120,000 / 86,400). A hedged table-definition sketch follows.
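
As a sketch only, here is how such a table might be defined with the AWS SDK for JavaScript v3: the key and index attributes are top-level, everything else lives in event_data, and write capacity is provisioned at the ~800 writes/second average computed above (assuming items of 1 KB or less; real sizing would need to handle class-time peaks). The table and index names are assumptions.

```typescript
import { DynamoDBClient, CreateTableCommand } from "@aws-sdk/client-dynamodb";

const ddb = new DynamoDBClient({ region: "us-east-2" });

async function main() {
  await ddb.send(new CreateTableCommand({
    TableName: "performance-events",            // hypothetical table
    // Only top-level attributes can be indexed; misc sensor payloads
    // stay inside the (unindexed) event_data list.
    AttributeDefinitions: [
      { AttributeName: "event_id", AttributeType: "S" },
      { AttributeName: "event_type", AttributeType: "N" },
      { AttributeName: "event_created", AttributeType: "N" },
    ],
    KeySchema: [{ AttributeName: "event_id", KeyType: "HASH" }],
    GlobalSecondaryIndexes: [{
      IndexName: "by-type-and-time",            // e.g. all rower events in a window
      KeySchema: [
        { AttributeName: "event_type", KeyType: "HASH" },
        { AttributeName: "event_created", KeyType: "RANGE" },
      ],
      Projection: { ProjectionType: "ALL" },
      ProvisionedThroughput: { ReadCapacityUnits: 100, WriteCapacityUnits: 800 },
    }],
    // ~800 writes/sec average from the daily volume above.
    ProvisionedThroughput: { ReadCapacityUnits: 100, WriteCapacityUnits: 800 },
  }));
}

main().catch(console.error);
```
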
Uncategorized

Oct.02

Storing Large Amounts of Sensor Data on Amazon’s DynamoDB

A great example of a practical use of NoSQL is the data marts that were created to cache and populate data in whatever model is required.
PerformanceEventType
  1. HeartRate
  2. Treadmill
  3. Rower
  4. Bike/strider
  5. Free weight
You can add properties to your event. Create a model for each performance-type event (heart rate, treadmill, etc.) that extends a base class containing event-id, event-type, and event-created. A sketch of this hierarchy follows.
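
Here is a minimal sketch of that hierarchy in TypeScript, with a hedged example of persisting one event via the AWS SDK for JavaScript v3 document client; the table name and field shapes are assumptions.

```typescript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

// Base class carrying the three attributes every performance event shares.
abstract class PerformanceEvent {
  constructor(
    public readonly event_id: string,
    public readonly event_type: number,    // 1 = HeartRate, 2 = Treadmill, ...
    public readonly event_created: number  // epoch millis
  ) {}
}

// Each performance type extends the base and adds its own properties.
class HeartRateEvent extends PerformanceEvent {
  constructor(id: string, created: number, public readonly bpm: number) {
    super(id, 1, created);
  }
}

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function main() {
  const evt = new HeartRateEvent("evt-001", Date.now(), 142);
  await ddb.send(new PutCommand({
    TableName: "performance-events",  // hypothetical table
    Item: { ...evt },                 // spread copies the instance's own fields
  }));
}

main().catch(console.error);
```
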
Uncategorized
