Shing Lyu

Disclaimer: This content reflects my personal opinions, not those of any organizations I am or have been affiliated with. Code samples are provided for illustration purposes only, use with caution and test thoroughly before deployment.

Understanding SageMaker Project Template Internals

Authors: Brajendra Singh, Shing Lyu

The blog post is derived from a workshop I built with Brajendra Singh, which was never published. I’m extracting the content to make a blog post. You will learn how to deploy the SageMaker provided MLOps template for model deployment and how the template works internally. If the screenshot is too small, right-click on the image and select Open Image in New Tab

MLOps is the one of the hottest topic in the field right now. Organizations are looking for ways to productionize their ML models, and MLOps is the key to repeatable results. Amazon SageMaker Projects is a feature that allows you to create a full MLOps pipeline in just a few clicks. You are going to create the MLOps pipeline using the SageMaker-provided MLOps template. This template creates the deployment pipeline, and creates a trigger to monitor if new models are approved in the SageMaker model registry, and use that as a signal to deploy it.

Why do you need a MLOps pipeline?

The MLOps pipeline you are going to deploy will help you build a robust foundation for your machine learning experiments. It automates the model deployment and testing processing so there is less room for human error. Once the model is approved in the model registry the model is deployed automatically to the staging endpoint, and an automated test is run against the staging endpoint. This help you catch problems with the model early and prevents you from deploying faulty model to production. You remain in control on what should be deployed to production, thanks to the manual approval step in the pipeline. All the pipeline configuration, CloudFormation templates and test script are managed as code in a CodeCommit repository, so you have repeatable deployment of the pipeline itself. By managing the pipeline as code, you also have better visibility on when and how the pipeline has changed. You can easily rollback any bad configuration. All these benefits gives your data scientists more confidence in experimenting fast and fail fast, because they know that they can easily rollback any failed experiments.

(continue reading...)


Using LLM to get cleaner voice transcriptions

Transcribing natural speech is a challenging task for voice-to-text services. When using traditional transcription tools, speakers are expected to talk slowly and clearly, avoiding fillers like “um” and “uh” as much as possible. However, this puts a huge cognitive load on the speaker to monitor every word before it’s spoken. Suddenly, having a normal conversation becomes an artificial performance.

Recent advances in natural language processing have enabled a better approach - transcribing first, then cleaning up the transcript afterward using a language model. The key insight is that language models can leverage contextual information and an understanding of language semantics to automatically fix many common mistakes and disfluencies in a transcript.

For example, say I’m giving an informal presentation and say:

“The, uh, model, uh no, algorithm, um, is able to…correct itself, uh, when it makes erroneous, uh, predictions…”

A language model could process this rough transcript and output:

“The algorithm is able to correct itself when it makes erroneous predictions.”

(continue reading...)


Summarize Text Quickly with Raycast and Amazon Bedrock

Raycast has become an indispensable tool in my workflow. The ability to quickly automate tasks and create custom integrations boosts my productivity daily. One common need I have is getting key summaries from long blocks of text. For example, summarize a long blog article I’m too lazy to read. Or quickly understand what is going on based on a long email thread. While there are pre-built Raycast AI extensions, I prefer to use Amazon Bedrock for privacy and security.

In the past, I used to copy the text and paste it into an AI chatbot, but this context switching was cumbersome and interrupted my flow. Raycast is a more natural way to summarize a piece of text I just copied.

(continue reading...)


How to link to external files in Joplin

I have been using Joplin for work for two years now, and I love it. Joplin is a free and open source note-taking app that lets you create notes in Markdown format, sync them across devices, and encrypt them for privacy. Joplin is also very flexible and customizable, allowing you to use plugins, themes, templates, and more to suit your needs.

One of the features that I recently discovered in Joplin is linking to local files and folders on my computer. This allows me to access my work documents without storing them in Joplin’s database, which can get too large and slow down the app. It also allows me to organize my files and folders in my own way, and take advantage of file sync services like Dropbox or Amazon WorkDocs. In this blog post, I will show you how to link to local files and folders in Joplin, and how to quickly copy the full file path from your file explorer.

(continue reading...)


Introducing the llm-chain-mock Driver for Cost-Effective LLM Testing

Llm-chain is a Rust crate that help you create advanced LLM applications such as chatbots, agents, and more. It supports various drivers, such as OpenAI, llama.cpp, and llm, that can connect to different APIs or run models locally. llm-chain allows you to easily switch between different drivers and options without a complete rewrite of your code.

One of the challenges of using llm-chain is the cost associated with invoking the LLMs. Depending on the driver you use, you may incur either an API fee or a compute resource cost for running your own model. The local models usually requires pretty powerful machines, and the setup process might be a little complicated.

(continue reading...)