Introduction

Post-call transcription and summary is a powerful feature provided by VideoSDK that allows users to generate detailed transcriptions and summaries of recorded meetings after they have concluded. This feature is particularly beneficial for capturing and documenting important information discussed during meetings, ensuring that nothing is missed and that there is a comprehensive record of the conversation.

How Post-Call Transcription Works

Post-call transcription involves processing the recorded audio or video content of a meeting to produce a textual representation of the conversation. Here’s a step-by-step breakdown of how it works:

  1. Recording the Meeting: During the meeting, the audio and video are recorded. This can include everything that was said and any shared content, such as presentations or screen shares.
  2. Uploading the Recording: Once the meeting is over, the recorded file is uploaded to the VideoSDK platform. This can be done automatically or manually, depending on the configuration.
  3. Transcription Processing: The uploaded recording is then processed by VideoSDK’s transcription engine. This engine uses advanced speech recognition technology to convert spoken words into written text.
  4. Retrieving the Transcription: After the transcription process is complete, the textual representation of the meeting is made available. This text can be accessed via the VideoSDK API and used in various applications.

Benefits of Post-Call Transcription

  • Accurate Documentation: Provides a precise record of what was discussed, which is invaluable for meeting minutes, legal documentation, and reference.
  • Enhanced Accessibility: Makes content accessible to those who may have missed the meeting or have hearing impairments.
  • Easy Review and Analysis: Enables quick review of key points and decisions made during the meeting without having to re-watch the entire recording.

Let's Get Started

VideoSDK empowers you to seamlessly integrate video calling into your React Native application within minutes.

In this quickstart, you'll explore the group calling feature of VideoSDK. Follow the step-by-step guide to integrate it within your application.

Prerequisites

  • Node.js v12+
  • NPM v6+ (comes installed with newer Node versions)
  • Android Studio or Xcode installed
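
You can quickly verify the Node.js and NPM versions from your terminal:

node -v   # should print v12.x or newer
npm -v    # should print 6.x or newer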

App Architecture

The app will contain two screens:

  1. Join Screen: This screen allows users to either create a meeting or join a predefined meeting.
  2. Meeting Screen: This screen contains a participant list and meeting controls, such as enabling/disabling the microphone and camera and leaving the meeting.

Getting Started with the Code!

Create App

Create a new React Native app using the command below.

npx react-native init AppName

For React Native setup, you can follow the Official Documentation.

VideoSDK Installation

Install the VideoSDK by using the following command. Ensure that you are in your project directory before running this command.

npm install "@videosdk.live/react-native-sdk"  "@videosdk.live/react-native-incallmanager"

Project Structure

  root
   ├── node_modules
   ├── android
   ├── ios
   ├── App.js
   ├── api.js
   ├── index.js

Project Configuration

Android Setup

  1. Add the required permissions in the AndroidManifest.xml file.

AndroidManifest.xml

<manifest
  xmlns:android="http://schemas.android.com/apk/res/android"
  package="com.cool.app"
>
    <!-- Give all the required permissions to app -->
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
    <!-- Needed to communicate with already-paired Bluetooth devices. (Legacy up to Android 11) -->
    <uses-permission
        android:name="android.permission.BLUETOOTH"
        android:maxSdkVersion="30" />
    <uses-permission
        android:name="android.permission.BLUETOOTH_ADMIN"
        android:maxSdkVersion="30" />

    <!-- Needed to communicate with already-paired Bluetooth devices. (Android 12 upwards)-->
    <uses-permission android:name="android.permission.BLUETOOTH_CONNECT" />

    <uses-permission android:name="android.permission.CAMERA" />
    <uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <uses-permission android:name="android.permission.SYSTEM_ALERT_WINDOW" />
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE"/>
    <uses-permission android:name="android.permission.WAKE_LOCK" />

    <application>
        <meta-data
            android:name="live.videosdk.rnfgservice.notification_channel_name"
            android:value="Meeting Notification" />
        <meta-data
            android:name="live.videosdk.rnfgservice.notification_channel_description"
            android:value="A notification will appear whenever a meeting starts." />
        <meta-data
            android:name="live.videosdk.rnfgservice.notification_color"
            android:resource="@color/red" />
        <service
            android:name="live.videosdk.rnfgservice.ForegroundService"
            android:foregroundServiceType="mediaProjection" />
        <service android:name="live.videosdk.rnfgservice.ForegroundServiceTask" />
    </application>
</manifest>

2. Update your colors.xml file for internal dependencies.

android/app/src/main/res/values/colors.xml

<resources>
  <item name="red" type="color">
    #FC0303
  </item>
  <integer-array name="androidcolors">
    <item>@color/red</item>
  </integer-array>
</resources>

3. Link the necessary VideoSDK Dependencies.

android/app/build.gradle

dependencies {
  implementation project(':rnwebrtc')
  implementation project(':rnfgservice')
}

android/settings.gradle

include ':rnwebrtc'
project(':rnwebrtc').projectDir = new File(rootProject.projectDir, '../node_modules/@videosdk.live/react-native-webrtc/android')

include ':rnfgservice'
project(':rnfgservice').projectDir = new File(rootProject.projectDir, '../node_modules/@videosdk.live/react-native-foreground-service/android')

MainApplication.java

import live.videosdk.rnwebrtc.WebRTCModulePackage;
import live.videosdk.rnfgservice.ForegroundServicePackage;

public class MainApplication extends Application implements ReactApplication {
  // Inside your ReactNativeHost implementation:
  @Override
  protected List<ReactPackage> getPackages() {
      @SuppressWarnings("UnnecessaryLocalVariable")
      List<ReactPackage> packages = new PackageList(this).getPackages();
      // Packages that cannot be autolinked yet can be added manually here:
      packages.add(new ForegroundServicePackage());
      packages.add(new WebRTCModulePackage());

      return packages;
  }
}

android/gradle.properties

# This fixes a weird WebRTC runtime problem on some devices.
android.enableDexingArtifactTransform.desugaring=false

4. Include the following line in your proguard-rules.pro file (optional: only needed if you are using ProGuard).

android/app/proguard-rules.pro

-keep class org.webrtc.** { *; }

5. In your android/build.gradle file, update the minimum SDK version to 23.

buildscript {
  ext {
      minSdkVersion = 23
  }
}

iOS Setup

  1. IMPORTANT: Ensure that you are using CocoaPods version 1.10 or later.

To update CocoaPods, you can reinstall the gem using the following command:

$ sudo gem install cocoapods

2. Manually link react-native-incall-manager (if it is not linked automatically).

  • In Xcode, select Your_Xcode_Project/TARGETS/Build Settings and, under Header Search Paths, add "$(SRCROOT)/../node_modules/@videosdk.live/react-native-incall-manager/ios/RNInCallManager".

3. Update the path of react-native-webrtc in your Podfile:

Podfile

pod 'react-native-webrtc', :path => '../node_modules/@videosdk.live/react-native-webrtc'

4. Change the version of your platform.

You need to change the platform field in the Podfile to 12.0 or above because react-native-webrtc doesn't support iOS versions earlier than 12.0. Update the line: platform :ios, '12.0'.

5. Install pods.

After updating the version, install the pods by running the following command from the ios directory:

pod install

6. Declare permissions in Info.plist:

Add the following lines to your Info.plist file, located at project folder/ios/projectname/Info.plist:

ios/projectname/Info.plist

<key>NSCameraUsageDescription</key>
<string>Camera permission description</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone permission description</string>

Register Service

Register the VideoSDK services in your root index.js file so they are initialized before the app renders.

import { AppRegistry } from "react-native";
import App from "./App";
import { name as appName } from "./app.json";
import { register } from "@videosdk.live/react-native-sdk";

register();

AppRegistry.registerComponent(appName, () => App);
index.js

Step 1: Get started with api.js

Before moving on, you must make an API request to generate a unique meetingId. You will need an authentication token, which you can create either through the videosdk-rtc-api-server-examples or directly from the VideoSDK Dashboard for developers.
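
If you run your own token server, a minimal sketch of token generation is shown below, following the pattern used in videosdk-rtc-api-server-examples; treat the exact payload fields and permission names as assumptions to verify against your dashboard documentation.

// token-server.js — a minimal sketch, assuming the JWT payload format used in
// videosdk-rtc-api-server-examples (verify the fields against the docs).
const jwt = require("jsonwebtoken");

const API_KEY = process.env.VIDEOSDK_API_KEY; // from the VideoSDK dashboard
const SECRET = process.env.VIDEOSDK_API_SECRET; // keep this server-side only

const token = jwt.sign(
  { apikey: API_KEY, permissions: ["allow_join"] }, // assumed payload shape
  SECRET,
  { algorithm: "HS256", expiresIn: "24h" }
);

console.log(token);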

export const token = "<Generated-from-dashboard>";
// API call to create meeting
export const createMeeting = async ({ token }) => {
  const res = await fetch(`https://api.videosdk.live/v2/rooms`, {
    method: "POST",
    headers: {
      authorization: `${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({}),
  });

  const { roomId } = await res.json();
  return roomId;
};
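
With this in place, calling createMeeting({ token }) returns a fresh roomId that the app uses as its meetingId. Note that, as shown above, the token is sent as-is in the authorization header rather than with a Bearer prefix.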

Step 2: Wireframe App.js with all the components

To build up a wireframe of App.js, you need to use VideoSDK Hooks and Context Providers. VideoSDK provides the MeetingProvider and MeetingConsumer components, along with the useMeeting and useParticipant hooks.

First, you need to understand Context Provider and Consumer. Context is primarily used when some data needs to be accessible by many components at different nesting levels.

  • MeetingProvider: This is the Context Provider. It accepts value config and token as props. The Provider component accepts a value prop to be passed to consuming components that are descendants of this Provider. One Provider can be connected to many consumers. Providers can be nested to override values deeper within the tree.
  • MeetingConsumer: This is the Context Consumer. All consumers that are descendants of a Provider will re-render whenever the Provider’s value prop changes.
  • useMeeting: This is the meeting hook API. It includes all the information related to the meeting, such as join, leave, enable/disable mic or webcam, etc.
  • useParticipant: This is the participant hook API. It is responsible for handling all the events and props related to one particular participant, such as name, webcamStream, micStream, etc.

The Meeting Context provides a way to listen for any changes that occur when a participant joins the meeting or makes modifications to their microphone, camera, and other settings.

Begin by making a few changes to the code in the App.js file.

import React, { useState, useEffect } from 'react';
import {
  SafeAreaView,
  TouchableOpacity,
  Text,
  TextInput,
  View,
  FlatList,
  StyleSheet,
  Modal,
} from 'react-native';
import {
  MeetingProvider,
  useMeeting,
  useParticipant,
  MediaStream,
  RTCView,
} from '@videosdk.live/react-native-sdk';
import { createMeeting, token } from './api';

function JoinScreen(props) {
  return null;
}

function ControlsContainer() {
  return null;
}

function MeetingView() {
  return null;
}

export default function App() {
  const [meetingId, setMeetingId] = useState(null);

  const getMeetingId = async (id) => {
    const meetingId = id == null ? await createMeeting({ token }) : id;
    setMeetingId(meetingId);
  };

  return meetingId ? (
    <SafeAreaView style={{ flex: 1, backgroundColor: "#F6F6FF" }}>
      <MeetingProvider
        config={{
          meetingId,
          micEnabled: false,
          webcamEnabled: true,
          name: "Test User",
        }}
        token={token}
      >
        <MeetingView />
      </MeetingProvider>
    </SafeAreaView>
  ) : (
    <JoinScreen getMeetingId={getMeetingId} />
  );
}
App.js

Step 3: Implement Join Screen

The join screen will serve as a medium to either create a new meeting or join an existing one.

function JoinScreen(props) {
  const [meetingVal, setMeetingVal] = useState("");
  return (
    <SafeAreaView
      style={{
        flex: 1,
        backgroundColor: "#F6F6FF",
        justifyContent: "center",
        paddingHorizontal: 6 * 10,
      }}
    >
      <TouchableOpacity
        onPress={() => {
          props.getMeetingId();
        }}
        style={{ backgroundColor: "#1178F8", padding: 12, borderRadius: 6 }}
      >
        <Text style={{ color: "white", alignSelf: "center", fontSize: 18 }}>
          Create Meeting
        </Text>
      </TouchableOpacity>

      <Text
        style={{
          alignSelf: "center",
          fontSize: 22,
          marginVertical: 16,
          fontStyle: "italic",
          color: "grey",
        }}
      >
        ---------- OR ----------
      </Text>
      <TextInput
        value={meetingVal}
        onChangeText={setMeetingVal}
        placeholder={"XXXX-XXXX-XXXX"}
        style={{
          padding: 12,
          borderWidth: 1,
          borderRadius: 6,
          fontStyle: "italic",
        }}
      />
      <TouchableOpacity
        style={{
          backgroundColor: "#1178F8",
          padding: 12,
          marginTop: 14,
          borderRadius: 6,
        }}
        onPress={() => {
          props.getMeetingId(meetingVal);
        }}
      >
        <Text style={{ color: "white", alignSelf: "center", fontSize: 18 }}>
          Join Meeting
        </Text>
      </TouchableOpacity>
    </SafeAreaView>
  );
}
App.js

Output


Step 4: Configure Transcription and Implement Controls

In this step, we set up the configuration for post-call transcription and summary generation and define the webhook URL where the webhook events will be delivered.

In the startRecording function, we have passed the transcription object and the webhook URL, which will initiate the post-call transcription process.

Finally, when we call the stopRecording function, both the post-call transcription and the recording will be stopped.
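
Because the webhook URL must point to an endpoint you control, a minimal receiver sketch is shown below; the event name and payload shape here are illustrative assumptions — check VideoSDK's webhook documentation for the actual values.

// webhook-server.js — a minimal receiver sketch (Express). The webhookUrl you
// pass to startRecording must resolve to a publicly reachable endpoint.
// NOTE: the event name and payload fields below are assumptions to verify
// against VideoSDK's webhook documentation.
const express = require("express");
const app = express();
app.use(express.json());

app.post("/", (req, res) => {
  const { webhookType, data } = req.body; // assumed payload shape
  if (webhookType === "transcription-completed") {
    console.log("Post-call transcription is ready:", data);
  }
  res.sendStatus(200); // acknowledge promptly so the event is not retried
});

app.listen(3000);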

The next step is to create a ControlsContainer component to manage features such as joining or leaving the meeting and enabling or disabling the webcam and microphone.

In this step, the useMeeting hook is utilized to acquire all the required methods, such as join(), leave(), toggleWebcam(), toggleMic(), startRecording(), and stopRecording().

const Button = ({ onPress, buttonText, backgroundColor, disabled }) => {
  return (
    <TouchableOpacity
      onPress={onPress}
      disabled={disabled} // ignore taps while the user has not joined yet
      style={{
        backgroundColor: backgroundColor,
        justifyContent: 'center',
        alignItems: 'center',
        padding: 12,
        borderRadius: 4,
        opacity: disabled ? 0.5 : 1,
      }}>
      <Text style={{ color: 'white', fontSize: 12 }}>{buttonText}</Text>
    </TouchableOpacity>
  );
};

function ControlsContainer({ join, leave, toggleWebcam, toggleMic, startRecording, stopRecording }) {
  const [isJoined, setIsJoined] = useState(false);

  const handleJoin = () => {
    join();
    setIsJoined(true);
  }

  const handleLeave = () => {
    leave();
    setIsJoined(false)
  }

  const webhookUrl = "https://www.example.com";

  const transcription = {
    enabled: true, // Enables post transcription
    summary: {
      enabled: true, // Enables summary generation
  
      // Guides summary generation
      prompt:
        "Write summary in sections like Title, Agenda, Speakers, Action Items, Outlines, Notes and Summary",
    },
  };

  return (
    <SafeAreaView>

      <View
        style={{
          padding: 24,
          flexDirection: 'row',
          justifyContent: 'space-between',
        }}>
        <Button
          onPress={() => {
            handleJoin();
          }}
          buttonText={'Join'}
          backgroundColor={'#1178F8'}
        />
        <Button
          onPress={() => {
            toggleWebcam();
          }}
          buttonText={'Toggle Webcam'}
          backgroundColor={'#1178F8'}
          disabled={!isJoined}
        />
        <Button
          onPress={() => {
            toggleMic();
          }}
          buttonText={'Toggle Mic'}
          backgroundColor={'#1178F8'}
          disabled={!isJoined}
        />
        <Button
          onPress={() => {
            handleLeave();
          }}
          buttonText={'Leave'}
          backgroundColor={'#FF0000'}
          disabled={!isJoined}
        />
      </View>
      <View
        style={{
          padding: 24,
          flexDirection: 'row',
          justifyContent: 'space-evenly',
        }}>
        <Button
          onPress={() => {
            startRecording(webhookUrl, null, null, transcription);
          }}
          buttonText={'Start Recording'}
          backgroundColor={'#1178F8'}
          disabled={!isJoined}
        />
        <Button
          onPress={() => {
            stopRecording();
          }}
          buttonText={'Stop Recording'}
          backgroundColor={'#1178F8'}
          disabled={!isJoined}
        />
      </View>
    </SafeAreaView>
  );
}
App.js
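
In the startRecording call above, the two null arguments stand in for the SDK's optional recording parameters (commonly the storage directory path and the recording configuration); only the webhook URL and the transcription object are required to initiate post-call transcription.
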
function ParticipantList() {
  return null;
}
function MeetingView() {
  // Get `participants` from useMeeting Hook
  const [notificationVisible, setNotificationVisible] = useState(false);
  const [notificationMessage, setNotificationMessage] = useState('');
  // The event callbacks are function declarations below, so they are
  // hoisted and can safely be referenced here.
  const {
    join,
    leave,
    toggleWebcam,
    toggleMic,
    participants,
    meetingId,
    startRecording,
    stopRecording,
  } = useMeeting({ onRecordingStarted, onRecordingStopped });
  const participantsArrId = [...participants.keys()];

  useEffect(() => {
    if (notificationVisible) {
      const timer = setTimeout(() => {
        setNotificationVisible(false);
      }, 2000); // 2000 milliseconds = 2 seconds

      return () => clearTimeout(timer);
    }
  }, [notificationVisible]);

  function showNotification(message) {
    setNotificationMessage(message);
    setNotificationVisible(true);
  }

  function onRecordingStarted() {
    showNotification('Recording Started');
  }

  function onRecordingStopped() {
    showNotification('Recording Stopped');
  }

  return (
    <View style={{ flex: 1 }}>
      {meetingId ? (
        <Text style={styles.meetingIdText}>Meeting Id: {meetingId}</Text>
      ) : null}
      <ParticipantList participants={participantsArrId} />
      {notificationVisible && (
        <View style={styles.notificationContainer}>
          <Text style={styles.notificationText}>{notificationMessage}</Text>
        </View>
      )}
      <ControlsContainer
        join={join}
        leave={leave}
        toggleWebcam={toggleWebcam}
        toggleMic={toggleMic}
        startRecording={startRecording}
        stopRecording={stopRecording}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  meetingIdText: {
    fontSize: 18,
    padding: 12,
  },
  notificationContainer: {
    position: 'absolute',
    top: 0,
    left: 0,
    right: 0,
    backgroundColor: 'rgba(0, 0, 0, 0.8)',
    paddingVertical: 10,
    paddingHorizontal: 20,
    justifyContent: 'center',
    alignItems: 'center',
  },
  notificationText: {
    color: '#FFFFFF',
    fontSize: 16,
    textAlign: 'center',
  },
});
App.js

Output


Step 5: Render Participant List

After implementing the controls, the next step is to render the joined participants.

You can get all the joined participants from the useMeeting Hook.

function ParticipantView() {
  return null;
}

function ParticipantList({ participants }) {
  return participants.length > 0 ? (
    <FlatList
      data={participants}
      renderItem={({ item }) => {
        return <ParticipantView participantId={item} />;
      }}
    />
  ) : (
    <View
      style={{
        flex: 1,
        backgroundColor: "#F6F6FF",
        justifyContent: "center",
        alignItems: "center",
      }}
    >
      <Text style={{ fontSize: 20 }}>Press Join button to enter meeting.</Text>
    </View>
  );
}
ParticipantList Component

Step 6: Handling Participant's Media

Before handling the participant's media, you need to understand a couple of concepts.

1. useParticipant Hook

The useParticipant hook is responsible for handling all the properties and events of one particular participant who has joined the meeting. It takes participantId as an argument.

useParticipant Hook Example

const { webcamStream, webcamOn, displayName } = useParticipant(participantId);

2. MediaStream API

The MediaStream API is beneficial for adding a MediaTrack to the RTCView component, enabling the playback of audio or video.

MediaStream API Example

<RTCView
  streamURL={new MediaStream([webcamStream.track]).toURL()}
  objectFit={"cover"}
  style={{
    height: 300,
    marginVertical: 8,
    marginHorizontal: 8,
  }}
/>
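
Note that MediaStream here comes from @videosdk.live/react-native-sdk (see the imports in App.js) rather than the browser; toURL() converts the stream into a URL that the native RTCView component can render.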

Rendering Participant Media

function ParticipantView({ participantId }) {
  const { webcamStream, webcamOn } = useParticipant(participantId);

  return webcamOn && webcamStream ? (
    <RTCView
      streamURL={new MediaStream([webcamStream.track]).toURL()}
      objectFit={"cover"}
      style={{
        height: 300,
        marginVertical: 8,
        marginHorizontal: 8,
      }}
    />
  ) : (
    <View
      style={{
        backgroundColor: "grey",
        height: 300,
        justifyContent: "center",
        alignItems: "center",
      }}
    >
      <Text style={{ fontSize: 16 }}>NO MEDIA</Text>
    </View>
  );
}
App.js

Output


Run your application

npm run android   # run on Android
npm run ios       # run on iOS

Output


Fetching the Transcription from the Dashboard

Once the transcription is ready, you can fetch it from the VideoSDK dashboard. The dashboard provides a user-friendly interface where you can view, download, and manage your transcriptions and summaries.
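
If you prefer to retrieve the output programmatically, the same artifacts can be fetched over the REST API; the endpoint path and query parameter in the sketch below are hypothetical placeholders — consult the VideoSDK API reference for the actual transcription route.

// fetchTranscriptions.js — a sketch only; the route below is a hypothetical
// placeholder, so confirm the real endpoint in the VideoSDK API reference.
import { token } from "./api";

export const fetchTranscriptions = async (roomId) => {
  const res = await fetch(
    `https://api.videosdk.live/v2/transcriptions?roomId=${roomId}`, // hypothetical route
    { headers: { authorization: token } } // same raw-token header as api.js
  );
  return res.json();
};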


Conclusion

Implementing post-call transcription and summary features in your React Native application using VideoSDK greatly enhances the functionality and user experience of your video conferencing tool. This detailed guide has provided you with the necessary steps to set up and configure the transcription service, from recording meetings to processing and retrieving transcriptions. By following this guide, you can ensure that all critical information discussed during meetings is accurately captured and easily accessible for future reference.