Audio Guidance Service

Notice: Using the audio guidance service directly is a legacy way to access the capabilities of the TA SDK. It is recommended to use a drive session instead. Refer to the drive session guide for more information.

Overview

Audio guidance is a key feature of the TA SDK. It outputs sentences that a TTS engine can easily convert into audio.

The audio guidance service generates sentences from four input sources.

The audio guidance service generates sentences automatically from the following two input sources:
* Navigation session: in navigation mode, guidance information tells the audio guidance service the vehicle's location and related road information.
* Alert service: alert information such as school zones and no-parking zones is prompted as a caution.

The following two input sources generate sentences on demand:
* Client's audio request: the client can request a certain type of audio prompt in a suitable situation, for example asking the audio guidance service to generate a sentence meaning "start navigation" at the beginning of navigation.
* Client's repeat audio operation: the client can request that the last prompted audio guidance sentence be broadcast again with automatically updated context.

Key Concepts

Timing Table

The timing table controls when an audio guidance sentence is prompted during navigation, based on current road conditions.

| AudioType | About the type | Turn type | Sample | Remark |
|---|---|---|---|---|
| kInfo | Info is prompted when entering an extremely long step. | all types | proceed on Ygnacio Valley Rd (when the step length on the highway is greater than 8.5 km) | There are two kinds of info prompt: with distance and without distance. The one without distance is given only when the step is extremely long. |
| | | all types | proceed 3 kilometers on Ygnacio Valley Rd (when the step length on the highway is 5 km to 8.5 km) | |
| kFirst | First audio is also known as INFO guidance. It is given far from the maneuver point, so it does not contain much information. | arrival | in 800 meters you will arrive at your destination | In most cases, the first guidance does not contain the road name. |
| | | exit right | in 1.5 kilometers use the two right lanes to exit right to I 405 south towards Long Beach | |
| | | turn right | in 500 meters turn right. | |
| | | roundabout | in 500 meters enter the roundabout and take the second exit | |
| | | u-turn | in 500 meters make a u-turn if possible | |
| kSecond | Second audio is also known as PREP guidance. It gives users most of the information needed to prepare for the turn. | turn right | in 300 meters turn right to 34th St | In rare cases, the first guidance has exactly the same content as the second guidance, so the user may hear two almost identical prompts, for example for arrival and exit right. |
| | | roundabout | in 100 meters take the second exit to Crawford St | |
| | | u-turn | in 200 meters make a u-turn if possible at Sullivan St | |
| | | arrival | in 300 meters your destination is on your left | |
| kThird | Third audio is also known as ACTION guidance. It is given when the user is approaching the maneuver point, so it should be short and accurate. | turn right | turn right to Huntington Dr | |
| | | roundabout | take the first exit | |
| | | exit right | exit right to I 405 south | |
| | | arrival | you have reached your destination on your right | |
| | | u-turn | make a u-turn if possible at 5th Avenue | |
| kRepeat | kRepeat audio is given when the user invokes NavigationSession::lastAudioGuidance(). It tells the user what to do next. | turn right | in 6 kilometers at the end of the road turn right to W Oakland Park Blvd | A repeat audio is generated based on the current distance to the turn. Inside the third guidance area it uses the third guidance template; otherwise it uses the second guidance template. One exception: at a roundabout it always uses the third guidance template. |
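
The two selection rules stated in the table (which kInfo variant to use, and which template a repeated prompt follows) can be sketched as plain functions. This is only an illustration under stated assumptions, not the SDK's implementation: `GuidanceTemplate`, `InfoPrompt`, `selectRepeatTemplate`, and `selectInfoPrompt` are hypothetical names, the `thirdAreaMeters` threshold is a made-up parameter, and the behavior for highway steps shorter than 5 km (no info prompt) is assumed rather than documented.

```cpp
// Hypothetical stand-ins for illustration; these are not TA SDK types.
enum class GuidanceTemplate { kSecond, kThird };
enum class InfoPrompt { kNone, kWithDistance, kWithoutDistance };

// Picks the template for a repeated prompt, following the kRepeat remark:
// inside the third guidance area, or at a roundabout, use the third template;
// otherwise use the second template.
// thirdAreaMeters is an assumed threshold, not a documented SDK value.
GuidanceTemplate selectRepeatTemplate(double distanceToTurnMeters,
                                      double thirdAreaMeters,
                                      bool isRoundabout)
{
    if (isRoundabout || distanceToTurnMeters <= thirdAreaMeters)
    {
        return GuidanceTemplate::kThird;
    }
    return GuidanceTemplate::kSecond;
}

// Chooses the kInfo variant from the highway step length, using the
// 5 km and 8.5 km figures quoted in the table samples.
// Assumption: steps shorter than 5 km get no info prompt.
InfoPrompt selectInfoPrompt(double stepLengthKm)
{
    if (stepLengthKm > 8.5)
    {
        return InfoPrompt::kWithoutDistance;
    }
    if (stepLengthKm >= 5.0)
    {
        return InfoPrompt::kWithDistance;
    }
    return InfoPrompt::kNone;
}
```

For the kRepeat sample in the table (6 kilometers before a right turn, no roundabout), `selectRepeatTemplate` returns the second template for any reasonable threshold, which matches the "in 6 kilometers ..." wording of the sample.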

Set up an audio guidance service with only turn-by-turn navigation prompts.

// the system instance is shared across the whole SDK
// create a settings instance only if a default setting does not meet your requirements; pass nullptr if the default values are good enough
auto service = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings);

// make a customized audio guidance prompt listener
auto listener = tn::make_shared<AudioGuidancePromptObserver>();
service->addListener(listener);

Remove the listener when it is no longer needed.

// remove the audio guidance prompt listener
service->removeListener(listener);

Make a customized audio guidance prompt observer.

class AudioGuidancePromptObserver : public tn::audio::AudioGuidanceListener
{
public:
    void onPromptAudio(const tn::audio::AudioData& audioData) override
    {
        const auto type = audioData.audioType().GetType();
        switch (type)
        {
            case tn::audio::AudioType::Type::kInfo:
            case tn::audio::AudioType::Type::kFirst:
            case tn::audio::AudioType::Type::kSecond:
            case tn::audio::AudioType::Type::kThird:
                // handle audio guidance prompt for turn-by-turn navigation according to the timing
                // ...
                break;
            case tn::audio::AudioType::Type::kRepeat:
                // handle repeated audio guidance prompt according to latest context
                // ...
                break;
            case tn::audio::AudioType::Type::kAlert:
                // handle alert information prompts
                // ...
                break;
            default:
                // not supported prompt types
                return;
        }

        const auto style = audioData.style();
        if (style == tn::audio::AudioData::PromptStyle::kText)
        {
            const auto& content = audioData.audioContent();
            auto sentence = content.sentence;
            int index = 1;
            for (const auto& token : content.tokens)
            {
                if (token.phoneme.empty())
                {
                    // use orthography if there's no phoneme
                    sentence = replaceTokenWithOrthography(sentence, index, token.orthography, token.orthographyCode);
                }
                else
                {
                    // use phoneme for better speech result
                    sentence = replaceTokenWithPhoneme(sentence, index, token.phoneme, token.phonemeCode);
                }

                index++;
            }

            // play the sentence in the TTS engine (tts stands for the client's own TTS wrapper)
            tts.say(sentence);
        }
        else
        {
            // just play a tone
            tts.tone();
        }
    }
};
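
The observer above calls `replaceTokenWithOrthography` and `replaceTokenWithPhoneme`, which the client has to supply. A minimal sketch of the substitution step follows. It assumes, purely for illustration, that tokens appear in the sentence as `{1}`, `{2}`, and so on; the real placeholder syntax is not shown in this guide and must be checked against the SDK. `replacePlaceholder` is a hypothetical name, and the sketch ignores the orthography and phoneme codes that the original helpers receive.

```cpp
#include <string>

// Hypothetical helper: replaces every occurrence of the placeholder "{index}"
// in sentence with replacement. The "{N}" syntax is an assumption made for
// this sketch, not the documented TA SDK sentence format.
std::string replacePlaceholder(std::string sentence, int index, const std::string& replacement)
{
    const std::string placeholder = "{" + std::to_string(index) + "}";
    std::size_t pos = 0;
    while ((pos = sentence.find(placeholder, pos)) != std::string::npos)
    {
        sentence.replace(pos, placeholder.size(), replacement);
        // continue searching after the inserted text so it is never re-scanned
        pos += replacement.size();
    }
    return sentence;
}
```

A phoneme-aware variant would typically wrap the replacement in whatever markup the chosen TTS engine expects before substituting it; that part depends on the engine and is left out here.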

Request a sentence to be generated.

tn::audio::AudioRequest request(tn::audio::AudioRequest::Type::kStartNavigation);
const auto data = service->requestAudio(request);

assert(data.audioType().isRequest());
const auto& content = data.audioContent();
tts.say(content.sentence);

Optimize Text Broadcast

The TA SDK provides both phonetic symbols and abbreviation optimization. It usually broadcasts using phonetic symbols. When phonetic symbols do not work, it can expand abbreviated text into full text, for example converting "Palm AVE." into "Palm Avenue."

Abbreviation optimization is disabled by default. Turn it on when creating the audio guidance service.

// turn on abbreviation optimization
std::string config = R"json(
{
    "OrthographyOptimize": 
    {
        "Enabled": true
    }
}
)json";

const auto settings = tn::foundation::Settings::Builder()
    .setString(tn::audio::SettingConstants::SETTING_JSON_CONTENT, config)
    .build();

// create an audio guidance service with abbreviation optimization enabled
auto service = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings);
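
To make the idea of abbreviation optimization concrete, the following sketch expands a few street-type abbreviations with a lookup table. It is not how the TA SDK does it: `expandAbbreviations` and the four-entry table are hypothetical, and a real implementation needs language-specific data and disambiguation.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Illustrative only: a tiny lookup table, not the SDK's abbreviation data.
// Words are split on whitespace and joined back with single spaces.
std::string expandAbbreviations(const std::string& text)
{
    static const std::map<std::string, std::string> kExpansions = {
        {"AVE", "Avenue"}, {"BLVD", "Boulevard"}, {"RD", "Road"}, {"HWY", "Highway"}};

    std::istringstream words(text);
    std::string word;
    std::string result;
    while (words >> word)
    {
        // Build the lookup key: drop one trailing period and upper-case the rest.
        std::string key = word;
        if (!key.empty() && key.back() == '.')
        {
            key.pop_back();
        }
        std::transform(key.begin(), key.end(), key.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

        if (!result.empty())
        {
            result += ' ';
        }
        const auto it = kExpansions.find(key);
        // Unknown words are kept exactly as written.
        result += (it != kExpansions.end()) ? it->second : word;
    }
    return result;
}
```

With this table, the road name "Palm AVE." from the example above becomes "Palm Avenue".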

Optimize Grammar

The TA SDK optimizes the use of grammatical structures in some specific languages. For example, in German, it chooses the correct feminine, masculine, or neuter preposition before a road name or a POI.

Support Different Broadcast Levels in Audio Guidance

The client can select among several broadcast levels: low, medium, full, commute, and tone-only.

service->setVerbosityLevel(tn::audio::AudioGuidanceService::VerbosityLevel::kVerbose);

Alerting ON/OFF Switch

The TA SDK provides about 30 types of alerts, each with an independent switch.

Set up an audio guidance service with both turn-by-turn navigation and alert prompts.

// the system instance is shared across the whole SDK
// create a settings instance only if a default setting does not meet your requirements; pass nullptr if the default values are good enough
// alertService supplies the alert information that the audio guidance service turns into prompts
auto audioService = tn::audio::AudioGuidanceServiceFactory::createInstance(system, settings, alertService);

const std::map<tn::alert::AlertType, bool> alertSwitches {
    {tn::alert::AlertType::schoolZone(), false}
};
audioService->enableAlertPrompt(alertSwitches);

const std::map<tn::alert::ViolationType, bool> violationSwitches = {
    {tn::alert::ViolationType::kOverSpeed, false}
};
audioService->enableViolationPrompt(violationSwitches);
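
The idea of one independent switch per alert type can be modeled with a small map-backed class. This is a sketch under assumptions, not SDK code: `PromptSwitches` is a hypothetical class, alert types are represented as plain strings, and the sketch assumes that an alert type without an explicit switch stays enabled, which this guide does not confirm.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a per-alert switch table; not a TA SDK class.
class PromptSwitches
{
public:
    // Applies the given switches on top of the current ones and leaves
    // alert types that are not mentioned untouched.
    void apply(const std::map<std::string, bool>& switches)
    {
        for (const auto& entry : switches)
        {
            enabled_[entry.first] = entry.second;
        }
    }

    // Assumption: an alert type with no explicit switch is enabled.
    bool isEnabled(const std::string& alertType) const
    {
        const auto it = enabled_.find(alertType);
        return it == enabled_.end() ? true : it->second;
    }

private:
    std::map<std::string, bool> enabled_;
};
```

Passing a map that contains only the entries to change, as the snippet above does for the school zone alert, keeps every other switch in its previous state in this model.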

Independent Language Setting

The client can set a broadcast language that differs from the system language. For example, when the system language is Croatian but only an English TTS engine is available, the client can set the audio language to English.

system->updateSystemLocale("hr_HR");

service->overrideLocale("en_US");
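
The effect of an independent language setting can be pictured as a simple precedence rule: the audio locale is the override when one is set, and the system locale otherwise. The sketch below models only that rule. `LocaleState` and `effectiveAudioLocale` are hypothetical names, and using an empty string to mean "no override" is a choice made for this example, not SDK behavior.

```cpp
#include <string>

// Hypothetical model of the locale choice; not the SDK's implementation.
struct LocaleState
{
    std::string systemLocale;   // for example "hr_HR"
    std::string audioOverride;  // for example "en_US"; empty means no override
};

// Returns the locale that audio prompts would use in this model:
// the override if present, otherwise the system locale.
std::string effectiveAudioLocale(const LocaleState& state)
{
    return state.audioOverride.empty() ? state.systemLocale : state.audioOverride;
}
```

In the Croatian example above, the model yields "en_US" for audio while the system locale remains "hr_HR".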

Tight Turn Alert

When a tight turn is ahead, the audio is broadcast while the vehicle is still on the preceding road.

Arrival Guidance

When the user arrives near the set destination, the TA SDK will broadcast where the destination is. When the set destination is inaccessible by car, the system will broadcast the nearest drivable area for users.

Signpost Priority

Signpost information is clearer and more direct for users while driving, so when road signs and signposts conflict, the signpost is broadcast in preference.

Supported Languages

| Locale code | Language name | Turn-by-turn supported status | Alert audio supported status |
|---|---|---|---|
| ar | Arabic | supported | supported |
| cs_CZ | Czech | supported | supported |
| da_DK | Danish | supported | supported |
| de_DE | German | supported | supported |
| el_GR | Greek | supported | supported |
| en_AU | English (Australia) | supported | supported |
| en_GB | English (United Kingdom) | supported | supported |
| en_US | English (United States) | supported | supported |
| es_ES | Spanish (Spain) | supported | supported |
| es_MX | Spanish (Mexico) | supported | supported |
| fi_FI | Finnish | supported | supported |
| fr_CA | French (Canada) | supported | supported |
| hr_HR | Croatian | supported | not supported |
| hu_HU | Hungarian | supported | not supported |
| it_IT | Italian | supported | supported |
| nl_NL | Dutch | supported | supported |
| no_NO | Norwegian | supported | supported |
| pl_PL | Polish | supported | supported |
| pt_PT | Portuguese (Portugal) | supported | supported |
| ro_RO | Romanian | supported | supported |
| ru_RU | Russian | supported | supported |
| sk_SK | Slovak | supported | supported |
| sv_SE | Swedish | supported | supported |
| th_TH | Thai | supported | supported |
| tr_TR | Turkish | supported | supported |
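
A client may want to check this table in code before choosing an audio locale, for instance to avoid expecting alert audio in Croatian or Hungarian. The sketch below hard-codes the table as it appears in this guide. `AudioSupport` and `audioSupportFor` are hypothetical names, the data is a snapshot that can go stale, and treating unlisted locales as unsupported is an assumption.

```cpp
#include <set>
#include <string>

// Hypothetical result type; not part of the TA SDK.
struct AudioSupport
{
    bool turnByTurn;
    bool alertAudio;
};

// Looks up a locale code in a hard-coded copy of the table above.
// Assumption: locales missing from the table are treated as unsupported.
AudioSupport audioSupportFor(const std::string& locale)
{
    static const std::set<std::string> kListed = {
        "ar",    "cs_CZ", "da_DK", "de_DE", "el_GR", "en_AU", "en_GB",
        "en_US", "es_ES", "es_MX", "fi_FI", "fr_CA", "hr_HR", "hu_HU",
        "it_IT", "nl_NL", "no_NO", "pl_PL", "pt_PT", "ro_RO", "ru_RU",
        "sk_SK", "sv_SE", "th_TH", "tr_TR"};
    // The only rows whose alert audio status is "not supported".
    static const std::set<std::string> kNoAlertAudio = {"hr_HR", "hu_HU"};

    if (kListed.count(locale) == 0)
    {
        return {false, false};
    }
    return {true, kNoAlertAudio.count(locale) == 0};
}
```

Keeping the exceptions in a separate set makes the two "not supported" rows easy to spot and update if the table changes.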