A common trait among professional software developers is the ability to not let the implementation details of their job become "precious".
It is easy to become enamored with a particular framework or technology and forget the primary reason we program in the first place. We do not program merely because React, Angular, Vue, Svelte, Ionic, NativeScript, TypeScript, etc. exist. We program because of what these technologies allow us to do. We program because it is the absolute best option for putting something into someone's hands that will solve a real problem they are having, making their lives easier, and maybe even bringing joy into their world.
NativeScript Preview can dramatically reduce the time and cognitive investment it takes to convert an idea into a valuable feature.
Let's explore a fun text-to-speech example with non-trivial implications.
For a high-level view of how NativeScript Preview works, you can learn more here.
In a nutshell, you install the NativeScript Preview app and scan a QR code on StackBlitz, which pushes that app to your device. As you develop, your changes are pushed to your phone in real time. Pretty cool!
The example we want to demo is in the link below, so let’s go!
To start things off, we will segment our speech component into its own directory and follow a common pattern used within the NativeScript community. The NativeScript compiler will generate platform-specific JavaScript bundles, so you could technically mix your iOS and Android code in a single file, knowing the non-targeted platform's code will be pruned for you. However, a cleaner approach is to completely separate the iOS and Android code and program against a type definition that defines the API for that component regardless of the platform.
You can see what a typical file structure looks like in the image below.
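Based on the files referenced throughout this article, the layout is roughly:

app/
  speech/
    common.ts         // shared enums and interfaces
    index.d.ts        // type definition describing the public API
    index.ios.ts      // iOS implementation
    index.android.ts  // Android implementation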
You will notice an additional common.ts file in the directory, which we use to abstract common elements such as enums and interfaces. For example, in the code below, we define a Languages enum to control the languages we can use and a SpeakOptions interface for type safety.
//app/speech/common.ts
export enum Languages {
  American = 'en-US',
  British = 'en-GB',
  Spanish = 'es-ES',
  Australian = 'en-AU',
}

export interface SpeakOptions {
  text: string;
  language?: string;
  finishedCallback?: Function;
}
Within our index.d.ts file, we will define the shape of the TextToSpeech class that will guide each platform implementation. Finally, we will define three basic methods that we will implement within the index.ios.ts and index.android.ts files.
//app/speech/index.d.ts
import { SpeakOptions } from './common';
export * from './common';

export declare class TextToSpeech {
  speak(options: SpeakOptions): void;
  stop(): void;
  destroy(): void;
}
In the example, we have an iOS and Android implementation, but to avoid redundancy we will focus solely on the iOS implementation. The same principles described apply to all platforms NativeScript supports.
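For a flavor of how the same type definition might be satisfied on the other platform, here is a rough sketch of what an index.android.ts could look like using android.speech.tts.TextToSpeech. This is illustrative only, not the example's actual Android code; the Utils helper comes from @nativescript/core, and wiring up finishedCallback (via an UtteranceProgressListener) is omitted for brevity.

//app/speech/index.android.ts (illustrative sketch)
import { Utils } from '@nativescript/core';
import { SpeakOptions } from './common';
export * from './common';

export class TextToSpeech {
  private tts: android.speech.tts.TextToSpeech;
  options: SpeakOptions;

  speak(options: SpeakOptions) {
    this.options = options;
    if (!this.tts) {
      // the Android engine initializes asynchronously, so we speak from onInit
      this.tts = new android.speech.tts.TextToSpeech(
        Utils.android.getApplicationContext(),
        new android.speech.tts.TextToSpeech.OnInitListener({
          onInit: (status: number) => {
            if (status === android.speech.tts.TextToSpeech.SUCCESS) {
              this.speakNow();
            }
          },
        })
      );
    } else {
      this.speakNow();
    }
  }

  stop() {
    this.tts?.stop();
  }

  destroy() {
    this.tts?.shutdown();
    this.tts = null;
  }

  private speakNow() {
    if (this.options.language) {
      this.tts.setLanguage(java.util.Locale.forLanguageTag(this.options.language));
    }
    this.tts.speak(this.options.text, android.speech.tts.TextToSpeech.QUEUE_FLUSH, null, 'speech');
  }
}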
We will begin our tour of the iOS implementation with a redacted version of the TextToSpeech class so you can see the high-level pieces before we step through each function individually and talk about how they work. The most important thing to notice is that this class implements the three methods we previously defined in our type definition.
//app/speech/index.ios.ts
import { SpeakOptions } from './common';
export * from './common';

export class TextToSpeech {
  utterance: AVSpeechUtterance;
  speech: AVSpeechSynthesizer;
  delegate: MySpeechDelegate;
  options: SpeakOptions;

  speak(options: SpeakOptions) { }
  stop() { }
  destroy() { }
  private init() { }
}
The utterance and speech instances are used to do the talking, while the delegate and options members are used to communicate with the outside world.
We will walk through the delegate implementation in a moment. If you are unfamiliar with this concept, it is a common pattern in iOS development to use a delegate to handle specific tasks for a concrete instance, such as event dispatching. You can learn more here.
utterance: AVSpeechUtterance;
speech: AVSpeechSynthesizer;
delegate: MySpeechDelegate;
options: SpeakOptions;
The speak method is pretty cut and dried. We initialize some things and then call this.speech.speakUtterance and pass in an utterance instance. But... what is really happening here?
speak(options: SpeakOptions) {
  this.init();
  this.options = options;
  this.utterance = AVSpeechUtterance.alloc().initWithString(options.text);
  this.utterance.voice = AVSpeechSynthesisVoice.voiceWithLanguage(options.language);
  this.speech.speakUtterance(this.utterance);
}
We are detecting whether or not we have an instance of speech, which leads us down one of two paths. If we have an instance, we do not need to re-instantiate the speech synthesizer; instead, we just need to stop any speech that may be happening. If we do not have an instance, we will instantiate an AVSpeechSynthesizer and assign it to speech. We will also assign an instance of MySpeechDelegate to delegate and connect the delegate to the speech instance.
private init() {
  if (!this.speech) {
    this.speech = AVSpeechSynthesizer.alloc().init();
    this.delegate = MySpeechDelegate.initWithOwner(new WeakRef(this));
    this.speech.delegate = this.delegate;
  } else {
    this.stop();
  }
}
The stop method detects if speaking is in progress, and if so, it instructs the AVSpeechSynthesizer instance to stop speaking immediately.
stop() {
  if (this.speech?.speaking) {
    this.speech.stopSpeakingAtBoundary(AVSpeechBoundary.Immediate);
  }
}
By setting speech to null, we ensure the AVSpeechSynthesizer instance can be cleaned up.
destroy() {
  this.speech = null;
}
To learn more about iOS delegates, this resource is a good read. In a nutshell, "Delegation is a design pattern that enables a class or structure to hand off (or delegate) some of its responsibilities to an instance of another type."
The delegate class can come in several forms based on what you are trying to accomplish and will be defined by the interface you implement. In our case, we are implementing the AVSpeechSynthesizerDelegate interface, which requires us to implement the methods you see below. NativeScript makes the actual connection via the static ObjCProtocols property, which can take a collection of any number of iOS protocols.
Let us take a moment and talk about how this delegate is being instantiated.
@NativeClass
class MySpeechDelegate extends NSObject implements AVSpeechSynthesizerDelegate {
  owner: WeakRef<TextToSpeech>;

  static ObjCProtocols = [AVSpeechSynthesizerDelegate];

  static initWithOwner(owner: WeakRef<TextToSpeech>) {
    const delegate = <MySpeechDelegate>MySpeechDelegate.new();
    delegate.owner = owner;
    return delegate;
  }

  speechSynthesizerDidStartSpeechUtterance() {}
  speechSynthesizerDidFinishSpeechUtterance() {}
  speechSynthesizerDidPauseSpeechUtterance() {}
  speechSynthesizerDidContinueSpeechUtterance() {}
  speechSynthesizerDidCancelSpeechUtterance() {}
}
The static initWithOwner pattern is also pretty common in iOS development due to its heavy usage of the delegate design pattern.
We are defining a WeakRef reference to our TextToSpeech instance and assigning that to the owner property on our delegate. By using WeakRef, we can hold a weak reference to another object without preventing that object from getting garbage-collected. You can read more about this feature in the WeakRef MDN Documentation.
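In plain JavaScript terms, the mechanics look something like this (an illustrative sketch, not code from the example):

// WeakRef holds a reference that does not keep its target alive
const owner = new TextToSpeech();
const ref = new WeakRef(owner);

// deref() returns the target if it has not been garbage-collected, otherwise undefined
const maybeOwner = ref.deref();
maybeOwner?.stop();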
The delegate itself is created within the static initWithOwner method, which takes an owner parameter that gets assigned to the delegate instance after we have called MySpeechDelegate.new(). A unique NativeScript construct here is the use of .new(), which is merely a convenience added to the JavaScript language by NativeScript's runtime to help instantiate native classes where you may not otherwise have a specific initializer to call. Once the delegate has been created and the owner assigned, we return the delegate back to the owner for future delegation-related responsibilities.
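To make that concrete, .new() is roughly the NativeScript equivalent of the classic Objective-C alloc/init two-step (an illustrative comparison for classes without a dedicated initializer):

// convenience form exposed by the NativeScript runtime
const delegate = MySpeechDelegate.new();

// the equivalent two-step pattern used elsewhere in this article
const synthesizer = AVSpeechSynthesizer.alloc().init();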
The one method that is going to do the "real work" in our delegate is speechSynthesizerDidFinishSpeechUtterance, which interacts not only with the TextToSpeech instance but also with the callback method that the Angular app passes in via the SpeakOptions object. We are going to try to get a reference to the owner by calling this.owner?.deref(), and if we have an instance, we will call stop and the finishedCallback method that comes from the speech options object defined by the Angular app.
speechSynthesizerDidFinishSpeechUtterance(
  synthesizer: AVSpeechSynthesizer,
  utterance: AVSpeechUtterance
) {
  const owner = this.owner?.deref();
  if (owner) {
    owner.stop();
    owner.options.finishedCallback();
  }
}
The other four methods will simply log the event and utterance.speechString. We could fill these in later, depending on user requirements.
speechSynthesizerDidStartSpeechUtterance(
  synthesizer: AVSpeechSynthesizer,
  utterance: AVSpeechUtterance
) {
  console.log(
    'speechSynthesizerDidStartSpeechUtterance:',
    utterance.speechString
  );
}

speechSynthesizerDidPauseSpeechUtterance(
  synthesizer: AVSpeechSynthesizer,
  utterance: AVSpeechUtterance
) {
  console.log(
    'speechSynthesizerDidPauseSpeechUtterance:',
    utterance.speechString
  );
}

speechSynthesizerDidContinueSpeechUtterance(
  synthesizer: AVSpeechSynthesizer,
  utterance: AVSpeechUtterance
) {
  console.log(
    'speechSynthesizerDidContinueSpeechUtterance:',
    utterance.speechString
  );
}

speechSynthesizerDidCancelSpeechUtterance(
  synthesizer: AVSpeechSynthesizer,
  utterance: AVSpeechUtterance
) {
  console.log(
    'speechSynthesizerDidCancelSpeechUtterance:',
    utterance.speechString
  );
}
The Angular portion of the application maintains the status quo by sticking to convention. It is pretty awesome to have the ability to interface with native platform APIs using JavaScript and equally awesome to be able to abstract those implementation details away so that Angular is oblivious to the brave new world it is living in. As far as the Angular application is concerned, it's just Angular.
The one thing we are going to do is tighten up the connection between Angular and the hardware by using NgZone to ensure that our Angular bindings update if a native event is triggered.
import { Component, OnInit, inject, NgZone } from '@angular/core';
import { Languages, TextToSpeech, SpeakOptions } from './speech';

@Component({
  selector: 'ns-talk',
  templateUrl: './talk.component.html',
})
export class TalkComponent implements OnInit {
  TTS: TextToSpeech;
  ngZone = inject(NgZone);
  speaking = false;
  speechText = 'Thank you for exploring NativeScript with StackBlitz!';
  speakOptions: SpeakOptions = {
    text: 'Whatever you like',
    language: Languages.Australian,
    finishedCallback: () => {
      this.reset();
      console.log('Finished Speaking!');
    },
  };

  ngOnInit() {
    this.TTS = new TextToSpeech();
  }

  talk() { }
  stop() { }
  reset() { }
}
If you remember, in the previous section we fired a callback function in the delegate. You can see that function is defined on the speakOptions object, which calls reset and logs to the console. We are also instantiating a TextToSpeech instance inside ngOnInit and assigning it to TTS.
The talk method does some preliminary work by assigning speechText to the speakOptions.text option and setting the speaking flag to true. Finally, TTS.speak is called with the updated options.
talk() {
  this.speakOptions.text = this.speechText;
  this.speaking = true;
  this.TTS.speak(this.speakOptions);
}
The stop method calls the stop method on the TTS instance and also calls reset.
stop() {
  this.TTS.stop();
  this.reset();
}
The reset method is interesting because it uses NgZone to manage change detection. Every time reset is called, ngZone.run is called, which then fires a callback function that sets speaking to false.
reset() {
  this.ngZone.run(() => {
    // since stop can be called as a result of native API interaction,
    // we call within ngZone to ensure view bindings update
    this.speaking = false;
  });
}
Assuming a basic level of Angular understanding, there is not much to discuss about this component template. We have an [(ngModel)] binding to update speechText, which is used to update the options we pass to our platform-native API when the talk button is tapped.
<ActionBar title="Text to Audible Speech"></ActionBar>
<StackLayout class="p-30">
  <Label
    text="Be sure to turn your speaker on and volume up."
    class="instruction"
    textWrap="true"
    width="250"
  ></Label>
  <TextView
    [(ngModel)]="speechText"
    hint="Type text to speak..."
    class="input"
    height="100"
  ></TextView>
  <Button
    text="Talk to me!"
    class="btn-primary m-t-20"
    (tap)="talk()"
  ></Button>
  <Button
    text="Stop"
    (tap)="stop()"
    [visibility]="speaking ? 'visible' : 'hidden'"
    class="btn-secondary"
  ></Button>
</StackLayout>
It's always a neat moment when developers experience real-time updates from StackBlitz to their phone and realize how empowering native APIs can be. The goal is for it to feel like every JavaScript application you've already written.
We hope you enjoyed the text-to-speech demo.