Introduction

Post-call transcription and summary is a powerful VideoSDK feature that lets you generate detailed transcriptions and summaries of recorded meetings after they have concluded. It is particularly useful for capturing and documenting the important information discussed during a meeting, ensuring that nothing is missed and that there is a comprehensive record of the conversation.

How Post-Call Transcription Works

Post-call transcription involves processing the recorded audio or video content of a meeting to produce a textual representation of the conversation. Here’s a step-by-step breakdown of how it works:

  1. Recording the Meeting: During the meeting, the audio and video are recorded. This can include everything that was said and any shared content, such as presentations or screen shares.
  2. Uploading the Recording: Once the meeting is over, the recorded file is uploaded to the VideoSDK platform. This can be done automatically or manually, depending on the configuration.
  3. Transcription Processing: The uploaded recording is then processed by VideoSDK’s transcription engine. This engine uses advanced speech recognition technology to convert spoken words into written text.
  4. Retrieving the Transcription: After the transcription process is complete, the textual representation of the meeting is made available. This text can be accessed via the VideoSDK API and used in various applications.

Benefits of Post-Call Transcription

  • Accurate Documentation: Provides a precise record of what was discussed, which is invaluable for meeting minutes, legal documentation, and reference.
  • Enhanced Accessibility: Makes content accessible to those who may have missed the meeting or have hearing impairments.
  • Easy Review and Analysis: Enables quick review of key points and decisions made during the meeting without having to re-watch the entire recording.

Let's Get Started

VideoSDK empowers you to seamlessly integrate the video calling feature into your Flutter application within minutes.

In this quickstart, you'll explore the group calling feature of VideoSDK. Follow the step-by-step guide to integrate it within your application.

Prerequisites


Before proceeding, ensure your development environment meets the following requirements:

  • Video SDK Developer Account: If you don't have one, you can create one by following the instructions on the Video SDK Dashboard.
  • Basic Understanding of Flutter: Familiarity with Flutter development is necessary.
  • Flutter Video SDK: Ensure you have the Flutter Video SDK installed.
  • Flutter Installation: Flutter should be installed on your device.
  • VideoSDK Auth Token: You need a VideoSDK account to generate tokens. Visit the VideoSDK dashboard to generate one.

Getting Started with the Code!

Follow the steps to create the environment necessary to add video calls to your app. You can also find the code sample for this quickstart here.

Create a New Flutter Project

Create a new Flutter app using the command below.
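For example (videosdk_flutter_quickstart is just a placeholder; use any project name you like):

flutter create videosdk_flutter_quickstart
cd videosdk_flutter_quickstart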

Install Video SDK

Install the VideoSDK package using the Flutter commands below. Make sure you are in your Flutter app directory before you run them.

// Run this command in the terminal to add the VideoSDK
flutter pub add videosdk
// Run this command to add the http library, used for the network call that generates a roomId
flutter pub add http
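
After running these commands, the dependencies section of your pubspec.yaml should contain entries like the following (the version numbers shown here are placeholders; flutter pub add pins the latest versions for you):

dependencies:
  flutter:
    sdk: flutter
  http: ^1.0.0
  videosdk: ^1.0.0
pubspec.yaml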
By the end of this quickstart, your project structure will look like this:

root
├── android
├── ios
└── lib
    ├── api_call.dart
    ├── join_screen.dart
    ├── main.dart
    ├── meeting_controls.dart
    ├── meeting_screen.dart
    └── participant_tile.dart
Project Files Structure

App Structure

The App widget will contain the JoinScreen and MeetingScreen widgets. MeetingScreen will have the MeetingControls and ParticipantTile widgets.
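
Conceptually, the hierarchy looks like this:

App
├── JoinScreen (create or join a meeting)
└── MeetingScreen
    ├── ParticipantTile (one tile per participant)
    └── MeetingControls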


Configure Project

For Android

  • Update /android/app/src/main/AndroidManifest.xml with the permissions required to implement the audio and video features:
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.FOREGROUND_SERVICE"/>
<uses-permission android:name="android.permission.WAKE_LOCK" />
AndroidManifest.xml
  • Also, you will need to set your build settings to Java 8, because the official WebRTC jar now uses static methods in the EglBase interface. Add this to your app-level /android/app/build.gradle:
android {
    //...
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}
  • If necessary, in the same build.gradle, increase the minSdkVersion of defaultConfig to 23 (the default Flutter template currently sets it to 16).
  • If necessary, in the same build.gradle, increase compileSdkVersion and targetSdkVersion to 31 (the default Flutter template currently sets it to 30). A sketch of these changes appears below.
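
A minimal sketch of those build.gradle changes, assuming the standard Flutter template (keep the rest of your existing configuration as-is):

android {
    // ...
    compileSdkVersion 31

    defaultConfig {
        // ...
        minSdkVersion 23
        targetSdkVersion 31
    }
}
build.gradle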

For iOS

  • Add the following entries to your /ios/Runner/Info.plist file, which allow your app to access the camera and microphone:
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
Info.plist
  • Uncomment the following line to define a global platform for your project in /ios/Podfile :
# platform :ios, '12.0'
Podfile

For macOS

  • Add the following entries to your /macos/Runner/Info.plist file, which allow your app to access the camera and microphone:
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) Camera Usage!</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) Microphone Usage!</string>
Info.plist
  • Add the following entries to your /macos/Runner/DebugProfile.entitlements file, which allow your app to access the camera and microphone and to open outgoing network connections:
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
DebugProfile.entitlements
  • Add the following entries to your /macos/Runner/Release.entitlements file, which allow your app to access the camera and microphone and to open incoming and outgoing network connections:
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
Release.entitlements

Step 1: Get started with api_call.dart

Before jumping into anything else, you will write a function to generate a unique meetingId. This requires an auth token; you can generate one either by using videosdk-rtc-api-server-examples or from the Video SDK Dashboard for development.

import 'dart:convert';
import 'package:http/http.dart' as http;

// Auth token used to generate a meeting and connect to it
String token = "<Generated-from-dashboard>";

// API call to create a meeting
Future<String> createMeeting() async {
  final http.Response httpResponse = await http.post(
    Uri.parse("https://api.videosdk.live/v2/rooms"),
    headers: {'Authorization': token},
  );

  // Surface a readable error if the request failed (e.g. a bad token)
  if (httpResponse.statusCode != 200) {
    throw Exception("Failed to create meeting: ${httpResponse.body}");
  }

  // Extract the roomId from the response body
  return json.decode(httpResponse.body)['roomId'];
}
api_call.dart

Step 2: Creating the JoinScreen

Let's create a join_screen.dart file in the lib directory and create a JoinScreen StatelessWidget.

The JoinScreen will consist of:

  • Create a Meeting Button - This button will create a new meeting for you.
  • Meeting ID TextField - This text field will contain the meeting ID you want to join.
  • Join Meeting Button - This button will join the meeting whose ID you have provided.
import 'package:flutter/material.dart';
import 'api_call.dart';
import 'meeting_screen.dart';

class JoinScreen extends StatelessWidget {
  final _meetingIdController = TextEditingController();

  JoinScreen({super.key});

  void onCreateButtonPressed(BuildContext context) async {
    // call api to create meeting and then navigate to MeetingScreen with meetingId,token
    await createMeeting().then((meetingId) {
      if (!context.mounted) return;
      Navigator.of(context).push(
        MaterialPageRoute(
          builder: (context) => MeetingScreen(
            meetingId: meetingId,
            token: token,
          ),
        ),
      );
    });
  }

  void onJoinButtonPressed(BuildContext context) {
    String meetingId = _meetingIdController.text;
    // Meeting IDs follow the xxxx-xxxx-xxxx pattern
    var re = RegExp(r"\w{4}-\w{4}-\w{4}");
    // Check that the meeting ID is non-empty and valid;
    // if so, navigate to MeetingScreen with the meetingId and token
    if (meetingId.isNotEmpty && re.hasMatch(meetingId)) {
      _meetingIdController.clear();
      Navigator.of(context).push(
        MaterialPageRoute(
          builder: (context) => MeetingScreen(
            meetingId: meetingId,
            token: token,
          ),
        ),
      );
    } else {
      ScaffoldMessenger.of(context).showSnackBar(const SnackBar(
        content: Text("Please enter valid meeting id"),
      ));
    }
  }

  
  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('VideoSDK QuickStart'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(12.0),
        child: Column(
          mainAxisAlignment: MainAxisAlignment.center,
          children: [
            ElevatedButton(
              onPressed: () => onCreateButtonPressed(context),
              child: const Text('Create Meeting'),
            ),
            Container(
              margin: const EdgeInsets.fromLTRB(0, 8.0, 0, 8.0),
              child: TextField(
                decoration: const InputDecoration(
                  hintText: 'Meeting Id',
                  border: OutlineInputBorder(),
                ),
                controller: _meetingIdController,
              ),
            ),
            ElevatedButton(
              onPressed: () => onJoinButtonPressed(context),
              child: const Text('Join Meeting'),
            ),
          ],
        ),
      ),
    );
  }
}
join_screen.dart
  • Update the home screen of the app in main.dart:
import 'package:flutter/material.dart';
import 'join_screen.dart';

void main() {
  runApp(const MyApp());
}

class MyApp extends StatelessWidget {
  const MyApp({super.key});

  // This widget is the root of your application.
  
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'VideoSDK QuickStart',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: JoinScreen(),
    );
  }
}
main.dart


Step 3: Creating the MeetingControls

Let's create a meeting_controls.dart file and create a MeetingControls StatelessWidget.

The MeetingControls will consist of:

  • Leave Button - This button will leave the meeting.
  • Toggle Mic Button - This button will unmute or mute the mic.
  • Toggle Camera Button - This button will enable or disable the camera.
  • Start Recording Button - This button will start recording the meeting (and with it, post-call transcription).
  • Stop Recording Button - This button will stop the recording.

MeetingControls will accept five functions in the constructor.

  • onLeaveButtonPressed - invoked when the Leave button is pressed.
  • onToggleMicButtonPressed - invoked when the Toggle Mic button is pressed.
  • onToggleCameraButtonPressed - invoked when the Toggle Camera button is pressed.
  • onStartRecordingPressed - invoked when the Start Recording button is pressed.
  • onStopRecordingPressed - invoked when the Stop Recording button is pressed.
import 'package:flutter/material.dart';

class MeetingControls extends StatelessWidget {
  final void Function() onToggleMicButtonPressed;
  final void Function() onToggleCameraButtonPressed;
  final void Function() onLeaveButtonPressed;
  final void Function() onStartRecordingPressed;
  final void Function() onStopRecordingPressed;

  const MeetingControls(
      {super.key,
      required this.onToggleMicButtonPressed,
      required this.onToggleCameraButtonPressed,
      required this.onLeaveButtonPressed,
      required this.onStartRecordingPressed,
      required this.onStopRecordingPressed});

  
  @override
  Widget build(BuildContext context) {
    return Row(
      mainAxisAlignment: MainAxisAlignment.spaceEvenly,
      children: [
        ElevatedButton(
            onPressed: onLeaveButtonPressed, child: const Text('Leave')),
        ElevatedButton(
            onPressed: onToggleMicButtonPressed,
            child: const Text('Toggle Mic')),
        ElevatedButton(
            onPressed: onToggleCameraButtonPressed,
            child: const Text('Toggle WebCam')),
        ElevatedButton(
          onPressed: onStartRecordingPressed,
          child: const Text("Start Recording"),
        ),
        ElevatedButton(
          onPressed: onStopRecordingPressed,
          child: const Text("Stop recording"),
        ),
      ],
    );
  }
}
meeting_controls.dart

Step 4: Creating ParticipantTile

Let's create a participant_tile.dart file and create a ParticipantTile StatefulWidget.

The ParticipantTile will consist of:

  • RTCVideoView - This will show the participant's video stream.

ParticipantTile will accept a Participant in its constructor:

  • participant - the participant of the meeting.
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';

class ParticipantTile extends StatefulWidget {
  final Participant participant;
  const ParticipantTile({super.key, required this.participant});

  
  @override
  State<ParticipantTile> createState() => _ParticipantTileState();
}

class _ParticipantTileState extends State<ParticipantTile> {
  Stream? videoStream;

  
  @override
  void initState() {
    // Pick up the participant's existing video stream, if any
    widget.participant.streams.forEach((key, Stream stream) {
      setState(() {
        if (stream.kind == 'video') {
          videoStream = stream;
        }
      });
    });
    _initStreamListeners();
    super.initState();
  }

  void _initStreamListeners() {
    widget.participant.on(Events.streamEnabled, (Stream stream) {
      if (stream.kind == 'video') {
        setState(() => videoStream = stream);
      }
    });

    widget.participant.on(Events.streamDisabled, (Stream stream) {
      if (stream.kind == 'video') {
        setState(() => videoStream = null);
      }
    });
  }

  
  @override
  Widget build(BuildContext context) {
    return Padding(
      padding: const EdgeInsets.all(8.0),
      child: videoStream != null
          ? RTCVideoView(
              videoStream?.renderer as RTCVideoRenderer,
              objectFit: RTCVideoViewObjectFit.RTCVideoViewObjectFitContain,
            )
          : Container(
              color: Colors.grey.shade800,
              child: const Center(
                child: Icon(
                  Icons.person,
                  size: 100,
                ),
              ),
            ),
    );
  }
}
participant_tile.dart

Step 5: Creating the MeetingScreen & Configuring Transcription

In this step, we create a meeting_screen.dart file with a MeetingScreen StatefulWidget. Here we set up the configuration for post-call transcription and summary generation; this is also where the webhook URL at which the webhooks will be received can be defined.

We have also introduced the setupRoomEventListener() function, which tracks recording state changes: the post-call transcription service automatically starts when recording begins (startRecording) and stops when stopRecording is called. It is registered once in initState so recording state is logged throughout the meeting. This streamlines the transcription process, providing seamless integration with the recording feature.

import 'package:flutter/foundation.dart';
import 'package:flutter/material.dart';
import 'package:videosdk/videosdk.dart';
import './participant_tile.dart';
import 'meeting_controls.dart';

class MeetingScreen extends StatefulWidget {
  final String meetingId;
  final String token;

  const MeetingScreen({
    super.key,
    required this.meetingId,
    required this.token,
  });

  
  @override
  _MeetingScreenState createState() => _MeetingScreenState();
}

class _MeetingScreenState extends State<MeetingScreen> {
  late Room _room;
  var micEnabled = true;
  var camEnabled = true;

  Map<String, Participant> participants = {};

  
  @override
  void initState() {
    super.initState();

    // create room
    _room = VideoSDK.createRoom(
      roomId: widget.meetingId,
      token: widget.token,
      displayName: "John Doe",
      micEnabled: micEnabled,
      camEnabled: camEnabled,
      defaultCameraIndex:
          kIsWeb ? 0 : 1, // Index of MediaDevices for default camera
    );

    setMeetingEventListener();
    // Register the recording-state listener once, up front
    setupRoomEventListener();

    // Join room
    _room.join();
  }

  // Set up meeting event listeners
  void setMeetingEventListener() {
    _room.on(Events.roomJoined, () {
      setState(() {
        participants[_room.localParticipant.id] = _room.localParticipant;
      });
    });

    _room.on(Events.participantJoined, (Participant participant) {
      setState(() {
        participants[participant.id] = participant;
      });
    });

    _room.on(Events.participantLeft, (String participantId) {
      if (participants.containsKey(participantId)) {
        setState(() {
          participants.remove(participantId);
        });
      }
    });

    _room.on(Events.roomLeft, () {
      setState(() {
        participants.clear();
      });
      Navigator.popUntil(context, ModalRoute.withName('/'));
    });
  }

  // Handle back button press to leave the room
  Future<bool> _onWillPop() async {
    _room.leave();
    return true;
  }

  // Listen for recording state changes
  void setupRoomEventListener() {
    _room.on(Events.recordingStateChanged, (String status) {
      // Status can be: RECORDING_STARTING, RECORDING_STARTED,
      // RECORDING_STOPPING or RECORDING_STOPPED
      print("Meeting recording status: $status");
    });
  }

  Map<String, dynamic> config = {
    "layout": {
      "type": "GRID",
      "priority": "SPEAKER",
      "gridSize": 4,
    },
    "theme": "DARK",
    "mode": "video-and-audio",
    "quality": "high",
    "orientation": "portrait",
  };

  Map<String, dynamic> transcription = {
    "enabled": true,
    // Optionally add a "webhookUrl" entry here to have transcription
    // webhooks delivered to your server (see the VideoSDK docs)
    "summary": {
      "enabled": true,
      "prompt":
          "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary",
    }
  };

  
  @override
  Widget build(BuildContext context) {
    return WillPopScope(
      onWillPop: _onWillPop,
      child: Scaffold(
        appBar: AppBar(
          title: const Text('VideoSDK QuickStart'),
        ),
        body: Padding(
          padding: const EdgeInsets.all(8.0),
          child: Column(
            children: [
              Text(widget.meetingId),
              // Render all participants
              Expanded(
                child: Padding(
                  padding: const EdgeInsets.all(8.0),
                  child: GridView.builder(
                    gridDelegate:
                        const SliverGridDelegateWithFixedCrossAxisCount(
                      crossAxisCount: 2,
                      crossAxisSpacing: 10,
                      mainAxisSpacing: 10,
                      mainAxisExtent: 300,
                    ),
                    itemBuilder: (context, index) {
                      return ParticipantTile(
                        key: Key(participants.values.elementAt(index).id),
                        participant: participants.values.elementAt(index),
                      );
                    },
                    itemCount: participants.length,
                  ),
                ),
              ),
              MeetingControls(
                onToggleMicButtonPressed: () {
                  micEnabled ? _room.muteMic() : _room.unmuteMic();
                  setState(() {
                    micEnabled = !micEnabled;
                  });
                },
                onToggleCameraButtonPressed: () {
                  camEnabled ? _room.disableCam() : _room.enableCam();
                  setState(() {
                    camEnabled = !camEnabled;
                  });
                },
                onLeaveButtonPressed: () {
                  _room.leave();
                },
                onStartRecordingPressed: () {
                  _room.startRecording(
                      config: config, transcription: transcription);
                },
                onStopRecordingPressed: () {
                  _room.stopRecording();
                },
              ),
            ],
          ),
        ),
      ),
    );
  }
}
meeting_screen.dart

Run and Test

Type flutter run to start your app.

The app is all set to test. Make sure to update the token in api_call.dart before running.

Your app should look like this after the implementation.

Output

Your post-call transcription will start once you start recording.

Fetching the Transcription from the Dashboard

Once the transcription is ready, you can fetch it from the VideoSDK dashboard. The dashboard provides a user-friendly interface where you can view, download, and manage your Transcriptions & Summary.
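
If you would rather retrieve the output programmatically, here is a minimal sketch. It reuses the token from api_call.dart; the post-transcriptions endpoint path and response shape are assumptions based on VideoSDK's REST conventions, so verify them against the API reference before use.

import 'dart:convert';
import 'package:http/http.dart' as http;
import 'api_call.dart'; // reuses the token from Step 1

// Hypothetical endpoint; confirm the exact route in the API reference.
Future<void> fetchTranscriptions(String roomId) async {
  final response = await http.get(
    Uri.parse("https://api.videosdk.live/v2/post-transcriptions?roomId=$roomId"),
    headers: {'Authorization': token},
  );
  print(json.decode(response.body));
}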

To access the Transcription & Summary files, open the corresponding session in the dashboard.

If you get a webrtc/webrtc.h file not found error at runtime on iOS, check the solution here.

Conclusion

Integrating post-call transcription and summary features into your Flutter application using VideoSDK provides significant advantages for capturing and documenting meeting content. This guide has detailed the steps required to set up and implement these features, ensuring that every conversation during a meeting is accurately transcribed and easily accessible for future reference.