Audio Guidance Service

Notice: Using the audio guidance service directly is a legacy way of accessing the TA SDK's capabilities. Using a drive session instead is recommended. Refer to the drive session guide for more information.

Overview

Audio guidance is a key feature of the TA SDK. It outputs sentences that a TTS engine can easily convert into audio.

Audio guidance service has four input sources to generate sentences.

The audio guidance service generates sentences automatically from the following two input sources:
* Navigation session: in navigation mode, guidance information tells the audio guidance service the vehicle's location and related road information.
* Alert service: alert information, such as a school zone or a no-parking zone, is prompted as a caution.

The following two input sources generate sentences on demand:
* Client's audio request: the client can request a certain type of audio prompt at an appropriate moment. For example, it can request the audio guidance service to generate a sentence meaning "start navigation" at the beginning of navigation.
* Client's repeat audio operation: the client can request that a previously prompted audio guidance sentence be broadcast again with automatically updated context.

Key Concepts

Timing Table

The timing table controls when audio guidance sentences are prompted during navigation, based on current road conditions.

AudioType	About the type	Turn type	Sample	Remark
kInfo	Info will be prompted when entering an extremely long step	all types	proceed on Ygnacio Valley Rd (when step length is greater than 8.5km on highway)	There are 2 types of info prompts, with distance and without distance. The one without distance is only given when the step is extremely long
kInfo	Info will be prompted when entering an extremely long step	all types	proceed 3 kilometers on Ygnacio Valley Rd (when the step length on the highway is 5km-8.5km)
kFirst	First audio is also known as INFO guidance. It will be given far from the maneuver point, so it won't contain too much information	arrival	in 800 meters you will arrive at your destination	In most cases, the first guidance won't contain the road name
		exit right	in 1.5 kilometers use the two right lanes to exit right to I 405 south towards Long Beach
		turn right	in 500 meters turn right.
		roundabout	in 500 meters enter the roundabout and take the second exit
		u-turn	in 500 meters make a u-turn if possible
kSecond	Second audio is also known as PREP guidance. It will give users most of the information to prepare for the turn	turn right	in 300 meters turn right to 34th St	In rare cases, the first guidance has exactly the same content as the second guidance, so two almost identical prompts may be heard, for example for arrival and exit right
		roundabout	in 100 meters take the second exit to Crawford St
		u-turn	in 200 meters make a u-turn if possible at Sullivan St
		arrival	in 300 meters your destination is on your left
kThird	Third audio is also known as ACTION guidance. It will be given when the user is approaching the maneuver point, so it should be short and accurate	turn right	turn right to Huntington Dr
		roundabout	take the first exit
		exit right	exit right to I 405 south
		arrival	you have reached your destination on your right
		u-turn	make a u-turn if possible at 5th Avenue
kRepeat	Repeat audio is given when the user invokes NavigationSession::lastAudioGuidance(). It tells the user what to do next	turn right	in 6 kilometers at the end of the road turn right to W Oakland Park Blvd	Repeat audio is generated based on the current distance to the turn. Within the third guidance area it uses the third guidance template; otherwise it uses the second guidance template. The one exception is a roundabout, which always uses the third guidance template.
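
The selection rules in the table can be sketched in a few lines. This is only an illustration of the decision logic described in the kInfo and kRepeat rows, not the SDK's implementation. The enums, the `InfoThresholds` struct, and both functions are hypothetical names. The 5 km and 8.5 km values come from the highway samples in the table; treating shorter steps as "no info prompt" and passing the start of the third guidance area in as a parameter are assumptions.

```cpp
// Hypothetical stand-ins for illustration; these are not TA SDK types.
enum class InfoPrompt { kNone, kWithDistance, kWithoutDistance };
enum class GuidanceTemplate { kSecond, kThird };

// Step-length thresholds taken from the highway samples in the timing table.
// Other road types may well use different values (assumption).
struct InfoThresholds {
    double withDistanceMinKm = 5.0;
    double withoutDistanceMinKm = 8.5;
};

// Picks the kind of info prompt for a step of the given length.
// Assumption: steps shorter than the lower threshold get no info prompt.
InfoPrompt chooseInfoPrompt(double stepLengthKm,
                            const InfoThresholds& t = InfoThresholds{}) {
    if (stepLengthKm > t.withoutDistanceMinKm) {
        return InfoPrompt::kWithoutDistance;  // extremely long step
    }
    if (stepLengthKm >= t.withDistanceMinKm) {
        return InfoPrompt::kWithDistance;
    }
    return InfoPrompt::kNone;
}

// Picks the template for a repeat prompt.
// thirdAreaStartMeters is the (assumed) distance at which third guidance begins.
GuidanceTemplate chooseRepeatTemplate(double distanceToTurnMeters,
                                      double thirdAreaStartMeters,
                                      bool isRoundabout) {
    if (isRoundabout) {
        return GuidanceTemplate::kThird;  // roundabouts always use the third template
    }
    return distanceToTurnMeters <= thirdAreaStartMeters
               ? GuidanceTemplate::kThird
               : GuidanceTemplate::kSecond;
}
```

With these rules, the kRepeat sample in the table (6 kilometers before a right turn) would fall outside the third guidance area and use the second guidance template, which matches its "in 6 kilometers ..." wording.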

Set up an audio guidance service with only turn-by-turn navigation prompts.

// the system instance is created once and shared across the whole SDK
// create a settings instance only if the defaults do not meet your requirements; pass nullptr to use the default values
auto service = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings);

// make a customized audio guidance prompt listener
auto listener = tn::make_shared<AudioGuidancePromptObserver>();
service->addListener(listener);

Remove listener when it's not needed anymore.

// remove the audio guidance prompt listener
service->removeListener(listener);

Make a customized audio guidance prompt observer.

class AudioGuidancePromptObserver : public tn::audio::AudioGuidanceListener
{
public:
    void onPromptAudio(const tn::audio::AudioData& audioData) override
    {
        const auto type = audioData.audioType().GetType();
        switch (type)
        {
            case tn::audio::AudioType::Type::kInfo:
            case tn::audio::AudioType::Type::kFirst:
            case tn::audio::AudioType::Type::kSecond:
            case tn::audio::AudioType::Type::kThird:
                // handle audio guidance prompt for turn-by-turn navigation according to the timing
                // ...
                break;
            case tn::audio::AudioType::Type::kRepeat:
                // handle repeated audio guidance prompt according to latest context
                // ...
                break;
            case tn::audio::AudioType::Type::kAlert:
                // handle alert information prompts
                // ...
                break;
            default:
                // not supported prompt types
                return;
        }

        const auto style = audioData.style();
        if (style == tn::audio::AudioData::PromptStyle::kText)
        {
            const auto& content = audioData.audioContent();
            auto sentence = content.sentence;
            int index = 1;
            for (const auto& token : content.tokens)
            {
                if (token.phoneme.empty())
                {
                    // use orthography if there's no phoneme
                    sentence = replaceTokenWithOrthography(sentence, index, token.orthography, token.orthographyCode);
                }
                else
                {
                    // use phoneme for better speech result
                    sentence = replaceTokenWithPhoneme(sentence, index, token.phoneme, token.phonemeCode);
                }

                index++;
            }

            // play sentence in TTS engine
            tts.say(sentence);
        }
        else
        {
            // just play a tone
            tts.tone();
        }
    }
};
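
The observer above calls `replaceTokenWithOrthography` and `replaceTokenWithPhoneme` without defining them; they are left to the client. The sketch below shows one way such a helper might look. It rests on a loud assumption: that the sentence marks each token with a `{N}` placeholder, which is invented here for illustration and is not confirmed by the source. `Token`, `replacePlaceholder`, and `fillTokens` are hypothetical names, and the orthography and phoneme code arguments are omitted for brevity.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token shape, loosely mirroring the fields used in the observer.
struct Token {
    std::string orthography;
    std::string phoneme;
};

// Replaces every "{index}" placeholder in `sentence` with `text`.
// ASSUMPTION: the "{N}" placeholder syntax is made up for this sketch.
std::string replacePlaceholder(std::string sentence, int index, const std::string& text) {
    const std::string placeholder = "{" + std::to_string(index) + "}";
    std::size_t pos = 0;
    while ((pos = sentence.find(placeholder, pos)) != std::string::npos) {
        sentence.replace(pos, placeholder.size(), text);
        pos += text.size();  // continue searching after the inserted text
    }
    return sentence;
}

// Fills each placeholder, preferring the phoneme and falling back to the orthography.
std::string fillTokens(std::string sentence, const std::vector<Token>& tokens) {
    int index = 1;  // 1-based, as in the observer
    for (const auto& token : tokens) {
        const std::string& text = token.phoneme.empty() ? token.orthography : token.phoneme;
        sentence = replacePlaceholder(std::move(sentence), index, text);
        ++index;
    }
    return sentence;
}
```

A real client would also need to wrap phonemes in whatever markup its TTS engine expects, which depends on the engine and is not shown here.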

Request a sentence to be generated.

tn::audio::AudioRequest request(tn::audio::AudioRequest::Type::kStartNavigation);
const auto data = service->requestAudio(request);

assert(data.audioType().isRequest());
const auto& content = data.audioContent();
tts.say(content.sentence);

Optimize Text Broadcast

The TA SDK provides both phonetic symbols and abbreviation optimization. Usually, the TA SDK uses phonetic symbols for broadcast. When phonetic symbols are not usable, the system can instead convert abbreviated text into full text. For example, it converts "Palm AVE." into "Palm Avenue."
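
To make the idea of abbreviation expansion concrete, here is a minimal sketch of a lookup-based expander. It is not how the TA SDK implements the feature. The function name `expandAbbreviations` and the small abbreviation table are hypothetical, and the sketch drops a trailing period attached to an expanded abbreviation.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Expands a few road-type abbreviations word by word (illustrative table only).
std::string expandAbbreviations(const std::string& text) {
    static const std::map<std::string, std::string> kExpansions = {
        {"AVE", "Avenue"}, {"BLVD", "Boulevard"}, {"RD", "Road"}, {"HWY", "Highway"}};

    std::istringstream in(text);
    std::string word;
    std::string out;
    while (in >> word) {
        // Build the lookup key: strip one trailing period and uppercase the rest.
        std::string key = word;
        if (!key.empty() && key.back() == '.') {
            key.pop_back();
        }
        for (char& c : key) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }

        if (!out.empty()) {
            out += ' ';
        }
        const auto it = kExpansions.find(key);
        out += (it != kExpansions.end()) ? it->second : word;  // keep unknown words as-is
    }
    return out;
}
```

A plain word lookup like this cannot tell ambiguous abbreviations apart (for example "St" as "Street" or "Saint"), which is one reason to prefer the SDK's built-in optimization over a client-side table.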

Abbreviation optimization is disabled by default. Enable it when creating the audio guidance service.

// turn on abbreviation optimization
std::string config = R"json(
{
    "OrthographyOptimize": 
    {
        "Enabled": true
    }
}
)json";

const auto settings = tn::foundation::Settings::Builder()
    .setString(tn::audio::SettingConstants::SETTING_JSON_CONTENT, config)
    .build();

// create an audio guidance service with abbreviation optimization enabled
auto service = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings);

Optimize Grammar

The TA SDK optimizes the use of grammatical structures in some specific languages. For example, in German, the TA SDK will choose the correct feminine, masculine, or neuter preposition before a road name or a POI.

Support for Different Broadcast Levels in Audio Guidance

The client can select from several broadcast levels: low, medium, full, commute, and tone-only.

service->setVerbosityLevel(tn::audio::AudioGuidanceService::VerbosityLevel::kVerbose);

Alerting ON/OFF Switch

The TA SDK provides about 30 types of alerts, each with an independent switch.

Set up an audio guidance service with both turn-by-turn navigation and alert prompts.

// the system instance is created once and shared across the whole SDK
// create a settings instance only if the defaults do not meet your requirements; pass nullptr to use the default values
// alertService supplies the alert information that the audio guidance service turns into prompts
auto audioService = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings, alertService);

const std::map<tn::alert::AlertType, bool> alertSwitches {
    {tn::alert::AlertType::schoolZone(), false}
};
audioService->enableAlertPrompt(alertSwitches);

const std::map<tn::alert::ViolationType, bool> violationSwitches = {
    {tn::alert::ViolationType::kOverSpeed, false}
};
audioService->enableViolationPrompt(violationSwitches);

Independent Language Setting

The client can set a broadcast language that differs from the system language. For example, when the system language is Croatian but only an English TTS engine is available, the client can set the audio language to English.

system->updateSystemLocale("hr_HR");

service->overrideLocale("en_US");

Tight Turn Alert

When a tight turn is coming up, the audio prompt is broadcast while the vehicle is still on the preceding road.

Arrival Guidance

When the user arrives near the set destination, the TA SDK will broadcast where the destination is. When the set destination is inaccessible by car, the system will broadcast the nearest drivable area for users.

Signpost Priority

Drivers can read signpost information more clearly and directly while driving, so when road signs and signposts conflict, the signpost is broadcast in preference.

Supported Languages

Locale code	Language name	Turn-by-turn support	Alert audio support
ar	Arabic	supported	supported
cs_CZ	Czech	supported	supported
da_DK	Danish	supported	supported
de_DE	German	supported	supported
el_GR	Greek	supported	supported
en_AU	English (Australia)	supported	supported
en_GB	English (United Kingdom)	supported	supported
en_US	English (United States)	supported	supported
es_ES	Spanish (Spain)	supported	supported
es_MX	Spanish (Mexico)	supported	supported
fi_FI	Finnish	supported	supported
fr_CA	French (Canada)	supported	supported
hr_HR	Croatian	supported	not supported
hu_HU	Hungarian	supported	not supported
it_IT	Italian	supported	supported
nl_NL	Dutch	supported	supported
no_NO	Norwegian	supported	supported
pl_PL	Polish	supported	supported
pt_PT	Portuguese (Portugal)	supported	supported
ro_RO	Romanian	supported	supported
ru_RU	Russian	supported	supported
sk_SK	Slovak	supported	supported
sv_SE	Swedish	supported	supported
th_TH	Thai	supported	supported
tr_TR	Turkish	supported	supported
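
A client may want to consult this table before overriding the audio locale, for instance to avoid expecting alert audio in a locale that does not support it. The sketch below encodes the table as a lookup. `LocaleSupport`, `supportedLocales`, `isTurnByTurnSupported`, and `isAlertAudioSupported` are hypothetical client-side helpers, not TA SDK APIs, and the data is copied from the table above.

```cpp
#include <map>
#include <string>

// Support flags for one locale, copied from the table above.
struct LocaleSupport {
    bool turnByTurn;
    bool alertAudio;
};

// Client-side copy of the supported-language table (hypothetical helper).
const std::map<std::string, LocaleSupport>& supportedLocales() {
    static const std::map<std::string, LocaleSupport> kTable = {
        {"ar", {true, true}},    {"cs_CZ", {true, true}}, {"da_DK", {true, true}},
        {"de_DE", {true, true}}, {"el_GR", {true, true}}, {"en_AU", {true, true}},
        {"en_GB", {true, true}}, {"en_US", {true, true}}, {"es_ES", {true, true}},
        {"es_MX", {true, true}}, {"fi_FI", {true, true}}, {"fr_CA", {true, true}},
        {"hr_HR", {true, false}}, {"hu_HU", {true, false}}, {"it_IT", {true, true}},
        {"nl_NL", {true, true}}, {"no_NO", {true, true}}, {"pl_PL", {true, true}},
        {"pt_PT", {true, true}}, {"ro_RO", {true, true}}, {"ru_RU", {true, true}},
        {"sk_SK", {true, true}}, {"sv_SE", {true, true}}, {"th_TH", {true, true}},
        {"tr_TR", {true, true}}};
    return kTable;
}

// Returns true if turn-by-turn audio is listed as supported for the locale.
bool isTurnByTurnSupported(const std::string& locale) {
    const auto& table = supportedLocales();
    const auto it = table.find(locale);
    return it != table.end() && it->second.turnByTurn;
}

// Returns true if alert audio is listed as supported for the locale.
bool isAlertAudioSupported(const std::string& locale) {
    const auto& table = supportedLocales();
    const auto it = table.find(locale);
    return it != table.end() && it->second.alertAudio;
}
```

For example, `isAlertAudioSupported("hr_HR")` is false while `isTurnByTurnSupported("hr_HR")` is true, so a client using Croatian audio could decide to switch alert prompts off or fall back to another locale. Locales missing from the table are treated as unsupported.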