Unleash the Power of Gemini Pro in a Multi-turn Android App

Unleash the power of Gemini Pro in your Android app: accept text and images as input and get text as output, using both the gemini-pro and gemini-pro-vision models.

Introduction 

Nowadays, AI has become part of our day-to-day life. Whether we want to clear up a doubt or learn more about a particular subject, we all rely on AI in one way or another. 

Google recently launched its generative AI chatbot, ‘Gemini’, to generate content and solve problems. However, these AI models are not always accurate and can generate misleading information, so we should use them wisely.  

In this blog, we will learn step by step how to build a simple Android application with Gemini Pro that supports multi-turn conversation with text and images as input. Let’s start! 

Here’s a demo: https://shorturl.at/hirOZ 

 


 

Prerequisites 

Before we proceed, let us look at the initial requirements: 

  • Android Studio Jellyfish [Canary build], version 2023.3.1 Canary 9 or above, installed (the current stable build does not support building the Gemini app). You can download the latest Canary builds from https://developer.android.com/studio/preview/ 
  • A working Google (Gmail) account. 
  • Basic knowledge of Jetpack Compose. 

Setup and get API key of Gemini Pro 

Step 1: 

Go to https://aistudio.google.com/ (sign in to your Google account if you are not already signed in). Accept the terms of service, and you will see the AI Studio home screen. 

Click on Get API Key to generate your API key. 

Step 2: 

Click on “Create API key in new project”. 

Step 3: 

You will now see your newly generated API key. Copy it, keep it safe, and do not share it with anyone. 

Setup project on Android Studio 

Step 1: 

Open Android Studio Jellyfish [Canary build], version 2023.3.1 Canary 9 or higher.  

Click on New Project to open the template selection screen. 

Select the template named ‘Gemini API Starter’ and click on Next.  

Step 2: 

Enter the desired project name and package name.  

Step 3: 

The wizard will now ask for the Gemini API key. Enter the API key that we generated earlier and click on Finish. 

 

Customize code as per our need 

Android Studio will now download the dependencies required to set up the project. The project also contains some predefined code that summarizes text using Gemini. We are not going to use that code, so you can remove it.  

We are going to build a conversational app that supports two input formats: text and images.  

We will use two different generative models to achieve that. 

1. gemini-pro

This model supports multi-turn conversation: it keeps a temporary history of the ongoing conversation, so each new answer can take into account the user’s previous inputs and the model’s own earlier outputs. However, it accepts only text as input and does not support images.

2. gemini-pro-vision

This model accepts both text and images as input and generates text from them, but it does not support multi-turn conversation, so it cannot keep a history of the previous conversation with the user.  

So, to build a conversational app that also supports image inputs, we’ll use a combination of these two models.
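
Before looking at the real SDK code, here is a minimal plain-Kotlin sketch of how the two models can share one conversation. Turn, TextModel, VisionModel and HybridChat are hypothetical stand-ins invented for this illustration; they are not types from the Gemini SDK. The idea is that turns with images go to the stateless vision model, turns without images go to the text model, and both kinds of turn are appended to the same history.

```kotlin
// Hypothetical stand-ins for illustration only; not Gemini SDK types.
data class Turn(val role: String, val text: String)

fun interface TextModel {
    // Receives the shared history plus the new prompt.
    fun reply(history: List<Turn>, prompt: String): String
}

fun interface VisionModel {
    // Stateless: sees only the current prompt and its images.
    fun reply(prompt: String, imageCount: Int): String
}

class HybridChat(
    private val textModel: TextModel,
    private val visionModel: VisionModel
) {
    private val _history = mutableListOf<Turn>()
    val history: List<Turn> get() = _history

    fun send(prompt: String, imageCount: Int = 0): String {
        val answer = if (imageCount > 0) {
            visionModel.reply(prompt, imageCount)
        } else {
            textModel.reply(_history.toList(), prompt)
        }
        // Record both kinds of turn as text, so later text-only
        // questions can refer back to an image answer.
        _history += Turn("user", prompt)
        _history += Turn("model", answer)
        return answer
    }
}

fun main() {
    val chat = HybridChat(
        TextModel { history, prompt -> "text reply to '$prompt' after ${history.size} turns" },
        VisionModel { prompt, imageCount -> "vision reply to '$prompt' about $imageCount image(s)" }
    )
    println(chat.send("What is in this photo?", imageCount = 1))
    println(chat.send("Tell me more about it"))
}
```

Note that only the text of an image turn ends up in the history in this sketch, so a follow-up question relies on the vision model’s written answer rather than on the image itself.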

Step 1:

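Create a simple data class for messages.

import android.net.Uri
import java.util.UUID

//Who produced a message: the user, the model, or an error
enum class Sender {
    USER, MODEL, ERROR
}

data class Message(
    val id: String = UUID.randomUUID().toString(),
    var text: String = "",
    val sender: Sender = Sender.USER,
    var isPending: Boolean = false,
    //URIs of the images attached to this message
    val imageUris: List<Uri> = listOf()
)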

Step 2:

We will use StateFlow for messages to keep our UI updated. For that, create a simple UiState class as below:

import androidx.compose.runtime.toMutableStateList

class ChatUiState( 
    messages: List<Message> = emptyList() 
) { 
    //Snapshot state list, so Compose recomposes when it changes 
    private val _messages: MutableList<Message> = messages.toMutableStateList() 
    val messages: List<Message> = _messages 
 
    fun addMessage(msg: Message) { 
        _messages.add(msg) 
    } 
 
    //To update the pending status of the last message 
    fun replaceLastPendingMessage() { 
        val lastMessage = _messages.lastOrNull() ?: return 
        //Swap in an updated copy instead of mutating the stored message 
        _messages[_messages.lastIndex] = lastMessage.copy(isPending = false) 
    } 
} 

Step 3:

Finally, to connect the UI with the AI model, let’s create a ViewModel class named “ChatViewModel”. In this class we will also create the instances of the AI models.

First, let’s create the AI model objects:

Extra: You can define safety settings as per your need to prevent the model from generating explicit results. This is optional.

private val safetySetting = listOf( 
    SafetySetting( 
        harmCategory = HarmCategory.HARASSMENT, 
        threshold = BlockThreshold.MEDIUM_AND_ABOVE 
    ), 
    SafetySetting( 
        harmCategory = HarmCategory.HATE_SPEECH, 
        threshold = BlockThreshold.MEDIUM_AND_ABOVE 
    ), 
    SafetySetting( 
        harmCategory = HarmCategory.DANGEROUS_CONTENT, 
        threshold = BlockThreshold.LOW_AND_ABOVE 
    ), 
    SafetySetting( 
        harmCategory = HarmCategory.SEXUALLY_EXPLICIT, 
        threshold = BlockThreshold.LOW_AND_ABOVE 
    ) 
) 
val textModel = GenerativeModel( 
    modelName = "gemini-pro", 
    apiKey = BuildConfig.apiKey, safetySettings = safetySetting 
) 
val imageModel = GenerativeModel( 
    modelName = "gemini-pro-vision", 
    apiKey = BuildConfig.apiKey, safetySettings = safetySetting 
) 

To start the chat with the AI model and keep our UI updated, let’s create a chat object and a MutableStateFlow as below:

private val chat = textModel.startChat( 
    //Empty conversation history at initialization 
    history = listOf() 
) 
 
private val _uiState: MutableStateFlow<ChatUiState> = 
    MutableStateFlow(ChatUiState(chat.history.map { content -> 
        Message( 
            text = content.parts.first().asTextOrNull() ?: "", 
            sender = if (content.role == "user") Sender.USER else Sender.MODEL, 
            isPending = false 
        ) 
    })) 
val uiState: StateFlow<ChatUiState> = 
    _uiState.asStateFlow() 
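
The mapping above relies on the role strings the API attaches to each turn: "user" for the user and "model" for the model. As a small sketch, that convention can be kept in one place with two helpers. toRoleOrNull and senderFromRole are hypothetical names made up for this example and are not part of the SDK; the Sender enum is repeated only so the snippet runs on its own.

```kotlin
enum class Sender { USER, MODEL, ERROR }

// Hypothetical helper: the role string to send for a given sender.
// ERROR bubbles exist only in the UI, so they have no role.
fun Sender.toRoleOrNull(): String? = when (this) {
    Sender.USER -> "user"
    Sender.MODEL -> "model"
    Sender.ERROR -> null
}

// Hypothetical helper: same rule as the mapping above,
// anything that is not "user" is treated as a model turn.
fun senderFromRole(role: String?): Sender =
    if (role == "user") Sender.USER else Sender.MODEL

fun main() {
    println(Sender.USER.toRoleOrNull())   // user
    println(senderFromRole("model"))      // MODEL
}
```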

To send messages to the AI model, let’s create a sendMessage method as below:

fun sendMessage(userMessage: String /*text from user*/,  
                uris: List<Uri> = listOf() /*URIs of selected images*/,  
                selectedImages: List<Bitmap> = listOf() /*Bitmaps of selected images*/) { 
    _uiState.value.addMessage( 
        Message( 
            text = userMessage, 
            sender = Sender.USER, 
            isPending = true, 
            imageUris = uris 
        ) 
    ) 
 
    viewModelScope.launch { 
        try { 
            //Create a content to send with images[media] 
            val mediaContent = content { 
                for (bitmap in selectedImages) { 
                    image(bitmap) 
                } 
                text(userMessage) 
                role = Sender.USER.toString().lowercase() 
            } 
            //Create a content to send without images[media] 
            val content = content { 
                text(userMessage) 
                role = Sender.USER.toString().lowercase() 
            } 
             
            val response = if (selectedImages.isNotEmpty()) { 
                //Send content with media to imageModel and add its data to textModel's history 
                val res = imageModel.generateContent(mediaContent) 
                chat.history.add(content) 
                chat.history.add(res.candidates.first().content) 
                res 
            } else { 
                //Send content without media to textModel [It manages history on its own] 
                chat.sendMessage(content) 
            } 
 
            //Update the UI 
            _uiState.value.replaceLastPendingMessage() 
            response.text?.let { modelResponse -> 
                _uiState.value.addMessage( 
                    Message( 
                        text = modelResponse, 
                        sender = Sender.MODEL, 
                        isPending = false 
                    ) 
                ) 
            } 
        } catch (e: Exception) { 
            _uiState.value.replaceLastPendingMessage() 
            _uiState.value.addMessage( 
                Message( 
                    text = e.localizedMessage?:"Error occurred. Please try again", 
                    sender = Sender.ERROR 
                ) 
            ) 
        } 
    } 
} 

Step 4:

Next, let’s create a simple UI to interact with. We are going to use Compose UI components. Create a new file called ‘ChatScreen’ containing a composable that takes the ViewModel as a parameter, as follows:

@Composable
internal fun ChatScreen(
    //viewModel() (from lifecycle-viewmodel-compose) keeps the same instance across configuration changes
    chatViewModel: ChatViewModel = viewModel()
) {
    val chatUiState by chatViewModel.uiState.collectAsState()//UI State
    val listState = rememberLazyListState()//ListState
    //To get BITMAP from URIs of selected images and Load Image
    val coroutineScope = rememberCoroutineScope()
    val imageRequestBuilder = ImageRequest.Builder(LocalContext.current)
    val imageLoader = ImageLoader.Builder(LocalContext.current).build()
}

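To display the chat history (ChatBubbleItem is the composable that renders a single message; it is not shown here):

@Composable 
fun ChatList( 
    messages: List<Message>, 
    listState: LazyListState 
) { 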
    LazyColumn( 
        reverseLayout = true, 
        state = listState 
    ) { 
        items(messages.reversed()) { message -> 
            ChatBubbleItem(message) 
        } 
    } 
} 

For text input and image selection:

@Composable 
fun MessageInput( 
    onSendMessage: (String, List<Uri>) -> Unit, 
    resetScroll: () -> Unit = {} 
) { 
    var userMessage by rememberSaveable { mutableStateOf("") } 
    val imageUris = rememberSaveable(saver = UriSaver()) { mutableStateListOf<Uri>() } 
 
    val pickMedia = rememberLauncherForActivityResult( 
        ActivityResultContracts.PickVisualMedia() 
    ) { imageUri -> 
        imageUri?.let { 
            imageUris.add(it) 
        } 
    } 
    ElevatedCard( 
        modifier = Modifier 
            .fillMaxWidth() 
            .shadow(elevation = 4.dp) 
    ) { 
        Row( 
            modifier = Modifier 
                .padding(vertical = 5.dp, horizontal = 10.dp) 
                .fillMaxWidth() 
        ) {//To select images 
            IconButton( 
                onClick = { 
                    pickMedia.launch( 
                        PickVisualMediaRequest(ActivityResultContracts.PickVisualMedia.ImageOnly)) 
                }, 
                modifier = Modifier.padding(all = 4.dp) 
                    .align(Alignment.CenterVertically) 
            ) { 
                Icon( 
                    Icons.Rounded.Add, contentDescription = "Add Image") 
            } 
            //Text input 
            OutlinedTextField( 
                value = userMessage, 
                label = { Text(stringResource(R.string.chat_label)) }, 
                placeholder = { Text(stringResource(R.string.summarize_hint)) }, 
                onValueChange = { userMessage = it }, 
                keyboardOptions = KeyboardOptions( 
                    capitalization = KeyboardCapitalization.Sentences, 
                ), 
                modifier = Modifier 
                    .align(Alignment.CenterVertically) 
                    .fillMaxWidth() 
                    .weight(0.90f) 
            ) 
            //Send message button 
            IconButton( 
                onClick = { 
                    if (userMessage.isNotBlank()) { 
                        onSendMessage(userMessage, imageUris.toList()) 
                        userMessage = "" 
                        imageUris.clear() 
                        resetScroll() 
                    } 
                }, 
                modifier = Modifier.padding(start = 12.dp) 
                    .align(Alignment.CenterVertically).fillMaxWidth() 
                    .weight(0.15f) 
            ) { 
                Icon( 
                    Icons.Default.Send, contentDescription = stringResource(R.string.action_send), modifier = Modifier) 
            } 
        } 
        //To display selected images 
        LazyRow( 
            modifier = Modifier.padding(all = 8.dp) 
        ) { 
            items(imageUris) { imageUri -> 
                val showDialog = remember { mutableStateOf(false) } 
                if (showDialog.value) { 
                    Alert(showDialog = showDialog.value, 
                        onDismiss = { showDialog.value = false }) { 
                        showDialog.value = false 
                        imageUris.remove(imageUri) 
                    } 
                } 
                AsyncImage( 
                    model = imageUri, 
                    contentDescription = null, 
                    modifier = Modifier 
                        .padding(4.dp) 
                        .requiredSize(72.dp) 
                        .clickable { 
                            showDialog.value = true 
                        } 
                ) 
            } 
        } 
    } 
} 

//Remove image confirmation dialog 
@Composable 
fun Alert( 
    showDialog: Boolean, 
    onDismiss: () -> Unit, confirmAction: () -> Unit 
) { 
    if (showDialog) { 
        AlertDialog( 
            title = { 
                Text("Remove image") 
            }, 
            text = { 
                Text("Are you sure you want to remove this image?") 
            }, 
            onDismissRequest = onDismiss, 
            confirmButton = { 
                TextButton(onClick = confirmAction) { 
                    Text("Yes") 
                } 
            }, 
            dismissButton = { 
                TextButton(onClick = onDismiss) { 
                    Text("No") 
                } 
            } 
        ) 
    } 
} 
//To maintain selected image URIs 
class UriSaver : Saver<MutableList<Uri>, List<String>> { 
    //Restore into a snapshot state list so the UI keeps observing changes 
    override fun restore(value: List<String>): MutableList<Uri> = value.map { 
        Uri.parse(it) 
    }.toMutableStateList() 
 
    override fun SaverScope.save(value: MutableList<Uri>): List<String> = 
        value.map { it.toString() } 
} 
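
The saver works by converting each URI to a string when state is saved and parsing it back when state is restored. The round trip can be sketched in plain Kotlin as follows. This sketch uses java.net.URI as a stand-in for android.net.Uri so that it runs on a plain JVM, and saveUris and restoreUris are hypothetical helper names, not Compose APIs.

```kotlin
import java.net.URI

// Hypothetical helper: what the saver does on save.
fun saveUris(uris: List<URI>): List<String> =
    uris.map { it.toString() }

// Hypothetical helper: what the saver does on restore.
fun restoreUris(saved: List<String>): MutableList<URI> =
    saved.map { URI.create(it) }.toMutableList()

fun main() {
    val original = listOf(
        URI.create("content://media/picker/0/image/1"),
        URI.create("content://media/picker/0/image/2")
    )
    val restored = restoreUris(saveUris(original))
    println(restored == original) // true
}
```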

You can use Scaffold to display the above UI components. Use MessageInput as the bottomBar of the Scaffold and ChatList as the main content. To display a toolbar, you can customize the topBar of the Scaffold. To get Bitmaps from the selected image URIs (selectedItems below), you can use the following code:

coroutineScope.launch { 
    val bitmaps = selectedItems.mapNotNull { 
        val imageRequest = imageRequestBuilder 
            .data(it) 
            // Scale the image down to 768px for faster uploads 
            .size(size = 768) 
            .precision(Precision.EXACT) 
            .build() 
        try { 
            val result = imageLoader.execute(imageRequest) 
            if (result is SuccessResult) { 
                return@mapNotNull (result.drawable as BitmapDrawable).bitmap 
            } else { 
                return@mapNotNull null 
            } 
        } catch (e: Exception) { 
            //Skip images that fail to load 
            return@mapNotNull null 
        } 
    } 
    //bitmaps is only in scope here, so call chatViewModel.sendMessage(...) inside this block 
} 

Finally, call sendMessage of chatViewModel inside the same coroutine, where bitmaps is in scope:

chatViewModel.sendMessage(inputText, selectedItems, bitmaps)
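
The image request earlier asks for a 768px target size to keep uploads fast. As a rough illustration of the arithmetic behind an aspect-ratio-preserving resize, the sketch below computes the new dimensions so that the longer side becomes 768. fitLongestSide is a hypothetical helper written for this example; it is not Coil’s implementation and may differ from how the library rounds or scales in practice.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper: scale (width, height) so the longer side
// equals maxSide while keeping the aspect ratio.
fun fitLongestSide(width: Int, height: Int, maxSide: Int = 768): Pair<Int, Int> {
    require(width > 0 && height > 0 && maxSide > 0) { "dimensions must be positive" }
    val scale = maxSide.toDouble() / maxOf(width, height)
    return Pair(
        (width * scale).roundToInt().coerceAtLeast(1),
        (height * scale).roundToInt().coerceAtLeast(1)
    )
}

fun main() {
    println(fitLongestSide(3072, 2048)) // (768, 512)
    println(fitLongestSide(1000, 4000)) // (192, 768)
}
```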

Now run the application on your device and witness the magic that you’ve just created!

I’ve created an SDK for plug-and-play usage. You can get the full demo source code from here: https://github.com/terminator712/Gemini-Multi-turn-Chat

Conclusion 

So, we’ve created a simple app that can communicate with the Gemini API and get a response to a query that contains text, images, or both. Note that it does not have a cache mechanism or any storage / database to persist the chat history.  

It is good practice to use any type of AI with proper safety settings to avoid explicit and inappropriate results.  

Also, as everyone knows, these kinds of chatbots can sometimes generate wrong or partially wrong results. So, we should always use AI wisely and not blindly trust every response it gives! 

Vishal Shah

Vishal has 6+ years of experience as a Software Engineer, working primarily on IoT-based Android applications. He has knowledge of BLE, NFC, MQTT and other IoT-based technologies, as well as various Android Jetpack components. He holds a BE degree from Gujarat Technological University.
