Apple recently filed a patent application for a text-to-speech and speech-to-text converter designed to work in noisy environments. The patent describes a system that uses a converter included on the logic board of the phone. This hardware-based conversion would have a distinct advantage over current text-to-speech systems. In a situation where text messaging is needed but for some reason unavailable, the system could be activated on both ends, with both speakers talking into their handsets and their words being converted to text and sent to the other party.
Apple has filed a patent application for a system that could convert text to speech and vice versa on the iPhone. Future iPhones are likely to provide end users with effective new ways of communicating, both in noisy environments like a restaurant and in a quiet office meeting where speaking aloud would be disruptive. The system involves using new text-to-speech and speech-to-text converters as well as providing a means of sending prerecorded notifications to the caller if you’re unable to speak when answering your phone.
A smartphone user may sometimes have to make or answer a phone call in a noisy environment. Noise could interfere with a phone conversation to a degree that the conversation is no longer intelligible to either party. A user in the noisy environment may try to scream into the phone over the noise, but the screaming and the noise may render the voice signal unintelligible at the other end.
The user may not be able to shout loud enough into the phone to cover the noise in a restaurant, for example. The user may not even be able to hear when the other end is talking. The noise may render the conversation unintelligible and may lead to the call being terminated.
It may also be inconvenient for a user to talk on a phone. For example, a user may be in a meeting and not want to draw attention by speaking into the phone. The user may try to whisper into the phone, but the whispering may render the conversation unintelligible. The user may choose to send a text message to the other party, but the other party may be on a landline where texting is unavailable, or may not have a texting plan. It could be frustrating to conduct a telephone conversation when the environment is noisy or the circumstances make it inappropriate for a user to speak.
One embodiment of the invention is directed to an iPhone that establishes an audio connection with a far-end user via a communication network. The communication device receives text input from a near-end user and converts the text input into speech signals. The speech signals are transmitted to the far-end user over the established audio connection while audio input to the device’s audio receiving component is muted.
In one embodiment, the communication device detects the noise level at the near end. When the noise level is above a threshold, the communication device could automatically activate text-to-speech conversion, or prompt the near-end user to activate it, at any point during a communication such as a phone call. Alternatively, the communication device may play back a pre-recorded message to inform the far-end user of the near-end user’s inability to speak due to the excessive noise at the near end.
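The threshold behavior described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not Apple's implementation: the 80 dB threshold, the `Action` names, and the `auto_activate` and `user_can_type` flags are all assumptions made for the example, since the filing as summarized here gives no numeric values or settings.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                  # noise is acceptable, carry on talking
    ACTIVATE_TTS = "activate_tts"  # switch to text-to-speech automatically
    PROMPT_USER = "prompt_user"    # ask the user whether to switch
    PLAY_MESSAGE = "play_message"  # play the pre-recorded "too noisy" message


# Hypothetical threshold; the article does not state a value.
NOISE_THRESHOLD_DB = 80.0


def choose_action(noise_db, auto_activate=True, user_can_type=True,
                  threshold_db=NOISE_THRESHOLD_DB):
    """Pick a response to the noise level measured at the near end."""
    if noise_db <= threshold_db:
        return Action.NONE
    if not user_can_type:
        # Assumed fallback: tell the far end why the user cannot speak.
        return Action.PLAY_MESSAGE
    return Action.ACTIVATE_TTS if auto_activate else Action.PROMPT_USER
```

For instance, `choose_action(90.0, auto_activate=False)` returns `Action.PROMPT_USER`, while a reading at or below the threshold leaves the call untouched.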
In another embodiment, the near-end user can activate text-to-speech conversion whenever necessary, regardless of the detected noise level. The near-end user could enter a text message, which is converted into speech signals for transmission via the established audio connection to the far-end user.
In yet another embodiment, the communication device could also perform speech-to-text conversion to convert the far-end user’s speech into text for display on the communication device. This feature could be used when the far-end communication device cannot, or is not enabled to, send or receive text messages. The speech-to-text conversion and the text-to-speech conversion could be activated at the same time, or independently of each other. The far-end communication device communicates with the near-end communication device in audio signals, regardless of whether the speech-to-text conversion or the text-to-speech conversion is activated.
Apple’s patent FIG. 1 is a diagram illustrating a communication environment in which a near-end communication device (e.g., a near-end phone 100) is engaged in, or about to be engaged in, a communication (e.g., a phone call) with a far-end communication device (e.g., a far-end phone 98) via a communication network (e.g., wireless network 120). The term “communication device” broadly refers to various real-time communication devices, e.g., landline telephone system (POTS) end stations, voice-over-IP end stations, cellular handsets, smartphones, computing devices, etc.
In one embodiment, the microphone (113) could be used to monitor the noise level in the environment surrounding the near-end phone 100. In an alternative embodiment, a separate microphone could be used to monitor the environmental noise. A noise meter (152) may be shown on the display screen to indicate the detected noise level. The noise meter may be shown when a phone call is made or received, when the noise level approaches a pre-determined threshold, or for as long as the near-end phone is powered on. The noise meter may indicate the noise level by colors, numerical values, the height or length of a bar indicator, etc.
In response to detecting a relative or particular noise level at the near end, the near-end phone displays a number of options for the user to choose from. The options may include: text-to-speech, two-way text, play (pre-recorded) message, and voicemail. The user may select one of these options using a physical button or a virtual button. In one embodiment, the near-end phone also displays the noise meter on the display screen to provide a visual indication of the environmental noise level at the near end.
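To make the noise meter idea concrete, here is a rough Python sketch that turns a buffer of microphone samples into a level and then into a bar and a color. It is a sketch under stated assumptions, not something taken from the patent: the use of RMS level in dBFS, the -60 dB floor, the -12 dB threshold, the 6 dB "near threshold" band, and the function names `level_dbfs` and `meter` are all chosen here for illustration.

```python
import math


def level_dbfs(samples):
    """RMS level of samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def meter(level_db, floor_db=-60.0, threshold_db=-12.0, width=10):
    """Return (bar, color) for a level; all limits here are hypothetical."""
    # Clamp the level into the displayable range [floor_db, 0 dBFS].
    clamped = min(0.0, max(floor_db, level_db))
    filled = round(width * (clamped - floor_db) / (0.0 - floor_db))
    bar = "#" * filled + "-" * (width - filled)
    if level_db > threshold_db:
        color = "red"      # above the threshold
    elif level_db > threshold_db - 6.0:
        color = "yellow"   # approaching the threshold
    else:
        color = "green"
    return bar, color
```

The yellow band corresponds to the case where the meter appears as the noise level approaches the threshold; a real device would presumably calibrate against sound pressure level rather than digital full scale.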
If the near-end user selects the text-to-speech option, the display may show “TEXT TO SPEECH” to indicate that the text-to-speech conversion has been activated. The near-end user may use a physical keyboard or a virtual keyboard to input text messages. The display also shows an outgoing message area that displays the text entered by the near-end user. As the near-end user inputs the text, the text-to-speech converter automatically converts the text into speech. The near-end phone transmits the converted speech signal to the far-end user over the audio connection that has already been established between the near-end user and the far-end user.
If the user selects the two-way text option, the display may show “TWO-WAY TEXT” to indicate that both the text-to-speech and speech-to-text conversions have been activated. The near-end user may use a physical keyboard or a virtual keyboard to input text messages. The display shows an incoming message area 612 for displaying the text converted from the far-end user’s speech, and an outgoing message area 613 for displaying the text entered by the near-end user. The established audio connection carries two-way voice signals between the near-end and the far-end users. The conversions from text to speech and from speech to text are both performed by the near-end phone. The far-end user could speak into the far-end phone in the same way as in a normal telephone conversation that does not involve text messages.
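The two modes above amount to a small piece of routing logic on the near-end phone, which the following Python sketch tries to capture. Everything named here is hypothetical: `CallSession`, its methods, and the `tts`, `stt`, and `send_audio` callables are placeholders standing in for real converters and the real audio connection, which the article does not describe at the API level.

```python
class CallSession:
    """Near-end side of a call; the far end only ever exchanges audio."""

    BANNERS = {
        "text-to-speech": "TEXT TO SPEECH",
        "two-way text": "TWO-WAY TEXT",
    }

    def __init__(self, tts, stt, send_audio):
        self.tts = tts                # placeholder: text -> audio
        self.stt = stt                # placeholder: audio -> text
        self.send_audio = send_audio  # placeholder for the audio connection
        self.tts_on = False
        self.stt_on = False
        self.mic_muted = False
        self.incoming = []            # incoming message area
        self.outgoing = []            # outgoing message area

    def select(self, option):
        """Activate a mode and return the banner shown on the display."""
        if option not in self.BANNERS:
            raise ValueError(f"unknown option: {option}")
        self.tts_on = True
        self.stt_on = option == "two-way text"
        self.mic_muted = True         # microphone input is muted while typing
        return self.BANNERS[option]

    def type_text(self, text):
        """Show typed text locally and send it to the far end as speech."""
        if not self.tts_on:
            raise RuntimeError("text-to-speech is not active")
        self.outgoing.append(text)
        self.send_audio(self.tts(text))

    def on_far_end_audio(self, audio):
        """Convert far-end speech to text only in two-way text mode."""
        if self.stt_on:
            self.incoming.append(self.stt(audio))
```

Note that both conversions live in the near-end object, which mirrors the point that the far-end phone needs no special support.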
A second iPhone patent published today relates to the one noted above and covers a new incoming call hold mechanism. According to Apple, the iPhone will hold an incoming call for a user when the user is temporarily unavailable to pick up the call. In response to an incoming call signal and an indication from the user to hold the call, the iPhone will answer the call and play back a pre-recorded message to the caller while holding the call.
The call could also be held until the user picks it up. If the user is on another call when the incoming call arrives, the iPhone answers the incoming call with a pre-recorded message and holds it, while maintaining uninterrupted communication on the in-progress call. The user could also enter an estimated hold time, which is announced to the caller.
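The hold mechanism can be pictured as a small state machine, sketched below in Python. This is a simplified illustration under assumptions: the `Phone` class, its method names, and the wording of the hold message are invented for the example, and a real implementation would deal with audio streams and telephony signaling rather than caller names.

```python
def hold_message(hold_minutes=None):
    """Build the announcement played to a held caller (wording is made up)."""
    message = "Please hold; your call will be picked up shortly."
    if hold_minutes is not None:
        message += f" Estimated hold time: {hold_minutes} minute(s)."
    return message


class Phone:
    def __init__(self):
        self.active = None   # caller on the in-progress call, if any
        self.held = []       # callers waiting on hold, oldest first
        self.played = []     # (caller, message) pairs that were announced

    def incoming(self, caller, hold=False, hold_minutes=None):
        """Answer directly, or answer with a message and hold the call."""
        if hold or self.active is not None:
            self.played.append((caller, hold_message(hold_minutes)))
            self.held.append(caller)   # the active call is left untouched
            return "held"
        self.active = caller
        return "answered"

    def hang_up(self):
        self.active = None

    def pick_up(self):
        """Take the oldest held call once the line is free."""
        if self.active is not None:
            raise RuntimeError("finish the active call first")
        if not self.held:
            raise RuntimeError("no held calls")
        self.active = self.held.pop(0)
        return self.active
```

In this sketch a second caller arriving mid-call is held automatically, which matches the case where the user is already on another call; whether the real system holds automatically or only on the user's indication in that case is not spelled out in the article.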