
An input audio data stream comprising speech is processed by an automatic censoring filter in either a real-time mode, or a batch mode, producing censored speech that has been altered so that undesired words or phrases are either unintelligible or inaudible. The automatic censoring filter employs a lattice comprising either phonemes and/or words derived from phonemes for comparison against corresponding phonemes or words included in undesired speech data. If the probability that a phoneme or word in the input audio data stream matches a corresponding phoneme or word in the undesired speech data is greater than a probability threshold, the input audio data stream is altered so that the undesired word or a phrase comprising a plurality of such words is unintelligible or inaudible. The censored speech can either be stored or made available to an audience in real-time.
In other words, if the filter decides a spoken word probably matches one on its undesired list, it'll bleep it. It has applications in more than just radio, though. You could imagine it being used in video games, perhaps as a "family mode" in which any profanity uttered via voice chat is bleeped.
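For anyone who wants the threshold rule from the abstract spelled out, here is a minimal sketch. It is not the patented method: the `Hypothesis` record, the `spans_to_censor` function, the 0.8 threshold, and the sample data are all made up for illustration, and a real system would score a phoneme or word lattice rather than a flat list of guesses.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognizer guess (hypothetical structure, not from the patent)."""
    word: str           # recognizer's best guess for this stretch of audio
    start: float        # start time in seconds
    end: float          # end time in seconds
    probability: float  # recognizer's confidence that the guess is right


def spans_to_censor(hypotheses, undesired, threshold=0.8):
    """Return (start, end) times for every guess that should be bleeped.

    A span is censored only if the guessed word is on the undesired list
    AND its probability is greater than the threshold.
    """
    return [
        (h.start, h.end)
        for h in hypotheses
        if h.word.lower() in undesired and h.probability > threshold
    ]


hyps = [
    Hypothesis("what", 0.0, 0.2, 0.95),
    Hypothesis("the", 0.2, 0.3, 0.97),
    Hypothesis("heck", 0.3, 0.6, 0.91),  # confident match: censored
    Hypothesis("heck", 0.9, 1.2, 0.40),  # weak match: left alone
]
print(spans_to_censor(hyps, {"heck"}))  # [(0.3, 0.6)]
```

The threshold is the whole trade-off: set it low and innocent words get bleeped, set it high and mumbled profanity slips through.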
This will not eliminate the delay. If the software doesn't listen for the beginning and end of each word, it's going to bleep a lot of incidental syllables. Take the way most people pronounce "suffocation," for instance. In fact, the delay will need to be even longer if it's going to take context into account. I don't think DJs have anything to worry about in the near future. You can get anything you want past this system by making up new words that aren't in its recognized vocabulary but contain the old standards. I know, inconBLEEPingceivable, huh?
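The "suffocation" problem is easy to show with plain text standing in for a phoneme stream (an assumption made purely for illustration; the function names and the sample sentence are invented too). A matcher that ignores word boundaries fires inside longer words, while one that respects them does not:

```python
import re


def naive_hits(text, banned):
    """Flag every occurrence of the banned string, even inside a longer word."""
    return [m.span() for m in re.finditer(re.escape(banned), text.lower())]


def whole_word_hits(text, banned):
    """Flag the banned string only where it stands alone as a word."""
    pattern = r"\b" + re.escape(banned) + r"\b"
    return [m.span() for m in re.finditer(pattern, text.lower())]


text = "Oh heck, the checkered flag"
print(naive_hits(text, "heck"))       # [(3, 7), (14, 18)]
print(whole_word_hits(text, "heck"))  # [(3, 7)]
```

The flip side is the loophole mentioned above: a whole-word matcher will wave through a made-up word that merely contains the banned one.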
Yeah, I was kind of wondering how it could block accurately in real time, too. The word has to have been said already for it to know what it is. Maybe by "real time" they mean running the audio through a short buffer before it's sent out.
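If the guess about a buffer is right, it might look something like the sketch below. This is only an assumption about how such a filter could work, not anything stated in the abstract; the `DelayBuffer` class, the three-frame delay, and the integer "frames" are all hypothetical. Audio is held back a few frames, so by the time a word is recognized its frames are still in the buffer and can be silenced before they go out.

```python
from collections import deque


class DelayBuffer:
    """Hold back `delay` frames so a word can be muted after it is recognized."""

    def __init__(self, delay):
        self.delay = delay
        self.frames = deque()

    def push(self, frame):
        """Add a frame; return the oldest frame once the buffer is full, else None."""
        self.frames.append(frame)
        if len(self.frames) > self.delay:
            return self.frames.popleft()
        return None

    def mute_last(self, n, silence=0):
        """Overwrite the newest n buffered frames (the word just recognized)."""
        for i in range(1, min(n, len(self.frames)) + 1):
            self.frames[-i] = silence

    def flush(self):
        """Release whatever is still buffered at the end of the stream."""
        remaining = list(self.frames)
        self.frames.clear()
        return remaining


buf = DelayBuffer(delay=3)
out = []
for frame in [1, 2, 3, 4, 5, 6]:
    released = buf.push(frame)
    if released is not None:
        out.append(released)
    if frame == 4:  # pretend the recognizer just flagged frames 3 and 4
        buf.mute_last(2)
out.extend(buf.flush())
print(out)  # [1, 2, 0, 0, 5, 6]
```

The catch is that the buffer has to be at least as long as the longest word or phrase being censored, so the delay never goes away; it just gets shorter than a human on a dump button.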