Deep learning

Deep learning architectures have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to, and in some cases surpassing, human expert performance.[33][34]

Already in 1948, Alan Turing produced work on "Intelligent Machinery" that was not published in his lifetime,[35] containing "ideas related to artificial evolution and learning RNNs". Frank Rosenblatt published a 1962 book that introduced perceptron variants and computer experiments, including a version with four-layer perceptrons "with adaptive preterminal networks" where the last two layers have learned weights (here he credits H. D. Block and B. W. Knight).[31] Subsequent developments in hardware and hyperparameter tunings have made end-to-end stochastic gradient descent the currently dominant training technique. The terminology "back-propagating errors" was actually introduced in 1962 by Rosenblatt,[37] but he did not know how to implement it, although Henry J. Kelley had a continuous precursor of backpropagation in 1960 in the context of control theory.[115]

In 2011, a CNN named DanNet,[116][117] by Dan Ciresan, Ueli Meier, Jonathan Masci, Luca Maria Gambardella, and Jürgen Schmidhuber, achieved for the first time superhuman performance in a visual pattern recognition contest, outperforming traditional methods by a factor of 3.[3] In 2012, Andrew Ng and Jeff Dean created an FNN that learned to recognize higher-level concepts, such as cats, only from watching unlabeled images taken from YouTube videos.[120] In October 2012, AlexNet by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton[4] won the large-scale ImageNet competition by a significant margin over shallow machine learning methods. Results on commonly used evaluation sets such as TIMIT (automatic speech recognition) and MNIST (image classification), as well as a range of large-vocabulary speech recognition tasks, have steadily improved.[161] Finally, data can be augmented via methods such as cropping and rotating, so that smaller training sets can be enlarged to reduce the chances of overfitting.[168]
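A minimal sketch of such augmentation, assuming images stored as NumPy arrays of shape (H, W, C); the helper names and the restriction to 90-degree rotations are illustrative assumptions, not details from the cited work:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, size):
    """Crop a random (size, size) window from an (H, W, C) image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def random_rotate(img):
    """Rotate an image by a random multiple of 90 degrees."""
    return np.rot90(img, k=rng.integers(4))

def augment(images, size):
    """Return random crops plus rotated copies, doubling the set."""
    crops = [random_crop(img, size) for img in images]
    rotated = [random_rotate(c) for c in crops]
    return np.stack(crops + rotated)

# Example: turn 10 synthetic 32x32 RGB images into 20 training samples.
batch = rng.random((10, 32, 32, 3))
augmented = augment(batch, size=28)  # shape (20, 28, 28, 3)
```

Deep learning libraries ship equivalent random-crop and rotation transforms; the effect is that each training epoch sees a slightly different copy of every image, which acts as a regularizer.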
By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced CPUs as the dominant method for training large-scale commercial cloud AI.[174][175] Atomically thin semiconductors are considered promising for energy-efficient deep learning hardware where the same basic device structure is used for both logic operations and data storage. In 2020, Marega et al. published experiments with a large-area active channel material for developing logic-in-memory devices and circuits based on floating-gate field-effect transistors (FGFETs).[188]

Another example is Facial Dysmorphology Novel Analysis (FDNA), used to analyze cases of human malformation connected to a large database of genetic syndromes. Closely related to the progress in image recognition is the increasing application of deep learning techniques to various visual art tasks. Using word embedding as an RNN input layer allows the network to parse sentences and phrases using an effective compositional vector grammar.[207][208] Research has explored the use of deep learning to predict the biomolecular targets,[209][210] off-targets, and toxic effects of environmental chemicals in nutrients, household products and drugs. For the neural joint entropy estimator (NJEE), the probability distribution of Y is obtained in practice by a softmax layer with a number of nodes equal to the alphabet size of Y; NJEE uses continuously differentiable activation functions, such that the conditions for the universal approximation theorem hold.[229]

Deep learning has been shown to produce competitive results in medical applications such as cancer cell classification, lesion detection, organ segmentation and image enhancement.[230][231] Modern deep learning tools demonstrate high accuracy in detecting various diseases and their usefulness to specialists in improving diagnostic efficiency. The use of AI and deep learning suggests the possibility of minimizing or eliminating manual lab experiments, allowing scientists to focus more on the design and analysis of unique compounds. Physics-informed neural networks (PINNs) leverage the power of deep learning while respecting the constraints imposed by physical models, resulting in more accurate and reliable solutions for financial mathematics problems.[251] The epigenetic clock uses information from 1000 CpG sites and predicts people with certain conditions to be older than healthy controls: IBD, frontotemporal dementia, ovarian cancer, obesity.

Research psychologist Gary Marcus noted: "Realistically, deep learning is only part of the larger challenge of building intelligent machines."[276] In further reference to the idea that artistic sensitivity might be inherent in relatively low levels of the cognitive hierarchy, a published series of graphic representations of the internal states of deep (20-30 layer) neural networks attempting to discern, within essentially random data, the images on which they were trained[277] demonstrates a visual appeal: the original research notice received well over 1,000 comments, and was the subject of what was for a time the most frequently accessed article on The Guardian's[278] website.[279] These issues may possibly be addressed by deep learning architectures that internally form states homologous to image-grammar[282] decompositions of observed entities and events.[279] Learning a grammar (visual or linguistic) from training data would be equivalent to restricting the system to commonsense reasoning that operates on concepts in terms of grammatical production rules, and is a basic goal of both human language acquisition[283] and artificial intelligence (AI).[288]

Another group showed that certain psychedelic spectacles could fool a facial recognition system into thinking ordinary people were celebrities, potentially allowing one person to impersonate another.[287] ANNs can, however, be further trained to detect attempts at deception, potentially leading attackers and defenders into an arms race similar to the kind that already defines the malware defense industry.[290]
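As a sketch of how such deception works in principle, the fast gradient sign method (FGSM) is one standard way to craft adversarial inputs; the toy model, weights, and step size below are hypothetical placeholders, not the specific attack used in the cited studies:

```python
import numpy as np

def fgsm_perturb(x, grad_wrt_x, epsilon=0.05):
    """Fast gradient sign method: shift each input feature a small step
    in the direction that increases the classifier's loss."""
    return x + epsilon * np.sign(grad_wrt_x)

# Hypothetical toy model: logistic classifier p = sigmoid(w . x),
# loss = -log(p) for the true class, so d(loss)/dx = -(1 - p) * w.
w = np.array([0.8, -1.2, 0.5])        # fixed, already-trained weights
x = np.array([1.0, 0.3, -0.7])        # correctly classified input
p = 1.0 / (1.0 + np.exp(-w @ x))      # probability assigned to true class
grad = -(1.0 - p) * w                 # analytic loss gradient w.r.t. x
x_adv = fgsm_perturb(x, grad)         # deceptive variant of x
```

Training the network on such perturbed examples alongside clean ones (adversarial training) is one concrete way defenders respond, feeding the arms race described above.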
The philosopher Rainer Mühlhoff distinguishes five types of "machinic capture" of human microwork to generate training data: (1) gamification (the embedding of annotation or computation tasks in the flow of a game), (2) "trapping and tracking" (e.g. CAPTCHAs for image recognition or click-tracking on Google search results pages), (3) exploitation of social motivations (e.g. tagging faces on Facebook to obtain labeled facial images), (4) information mining (e.g. by leveraging quantified-self devices such as activity trackers) and (5) clickwork.
Representing images on multiple layers of abstraction in deep learning
How deep learning is a subset of machine learning and how machine learning is a subset of artificial intelligence (AI)
Richard Green explains how deep learning is used with a remotely operated vehicle in mussel aquaculture.
Visual art processing of Jimmy Wales in France, with the style of Munch's "The Scream" applied using neural style transfer