VALL-E

VALL-E is a generative artificial intelligence system for speech synthesis developed by Microsoft Research and announced on January 5, 2023.[1] It can "recreate any voice from a three-second sample clip".[2] It was trained on 60,000 hours of English-language speech from Meta’s audio library LibriLight.[3]

Developer(s): Microsoft
Platform: Cloud computing platforms