MusicGen Usage and Installation Guide (Translated)

AI TOOLS | Updated 3 years ago (2023) | Prompt engineer


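Today I came across an article introducing MusicGen. After taking a look, I found that AI music generation is genuinely powerful, and it turns out to be open source with a Git repository. Dr. Furkan even recorded a video explaining how to install and use MusicGen. Below are the Git repository and the model links. With the video and the translation side by side, it should be easy to get started.

Open-source repository: https://github.com/facebookresearch/audiocraft

Free online demo: https://huggingface.co/spaces/facebook/MusicGen

Installation steps: https://github.com/FurkanGozukara/Stable-Diffusion/blob/main/Tutorials/AI-Music-Generation-Audiocraft-Tutorial.md

Dr. Furkan's video: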

mpvideo.qpic.cn/0b2eqaaecaaakuaajqeifzsfbagdigaaaqia.f10002.mp4?dis_k=2e3396d5eb6aa8d673a943ac10eb8d33&dis_t=1686742235&play_scene=10120&auth_info=JuD8xdEJMmNui5PI2F8LYDcDJjo7HkBjO096NWdEd2AfagJXdA==&auth_key=1525323b04eaee1bfa82db09dad979f0&vid=wxv_2970575581421830145&format_id=10002&support_redirect=0&mmversion=false

The video is entirely in English. Transcript:

00:00

Greetings, everyone. Facebook Research has released AudioCraft, which can generate music from text prompts or from given audio files. AudioCraft is the best music generator released so far. Today I will show you how to install and use it. I will begin by showing you some of the samples I have generated on my computer. I am very bad at music, so you can consider these the worst samples it can generate. Now let's listen to them together.

01:02

For generating this song, I used this input text, and I didn't use any melody condition. I used the large model with default parameters. The large model works very well on an RTX 3090. It probably requires about 15 to 16 GB of VRAM. However, if you have a GPU with less VRAM, you can use the medium model or the small model as well.

01:28

The generated music files are not saved on your computer by default. So you need to click this three-dots icon here and click Download, and it will save the generated audio file into your Downloads folder. How you prompt to get your output is totally up to you. I haven't had much chance to test yet to find better examples. But still, this model is amazing. Now, in this example, I am showing you the combination of this prompt with the bach.mp3 file that comes along with the model itself. Let's listen to the generated music.

02:39

I also used this Creative Commons song as an example. Let me also show you how it sounds, and it is amazing. So first, I will let you listen to the melody conditioning.

03:07

And now let's listen to the music generated from this melody condition together with this input text.

03:44

As I said, this is not cherry-picked. This is the first generation, because I didn't have much time to test and experiment more. For installation, I have prepared an amazing GitHub readme file. The link to this file will be in the description of the video. The file will be updated as necessary, so you may find more information in it. Why am I using such files? Because I am producing a lot of AI content, and repositories get updated and get broken all the time, very quickly. So I will keep this file up to date, and you will always be able to follow this video and install and use this open-source library.

04:29

So there are two requirements that you need to meet. First, you need to install Python. I suggest you use a Python 3.10.x version. I am using Python 3.10.9, and it should be set in the PATH as the default. So when you type python in your CMD window, you should see a message like this. The second thing that you need to have installed is Git. When you type git in your CMD window, you should see a Git message like this. If you don't know how to install them, I have an excellent tutorial video. The link is here and the download links are in here.
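A quick way to confirm both prerequisites before going further is to ask the operating system where it finds them. The sketch below is not from the video. It uses only Python's standard library, and the helper name find_tools is made up for illustration.

```python
import shutil


def find_tools(names=("python", "git")):
    """Map each tool name to its resolved location on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        status = path if path else "NOT FOUND (install it and add it to PATH)"
        print(f"{tool}: {status}")
```

If either tool is reported as not found, the CMD window will not recognize it either, so fix the PATH before continuing.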

05:08

So now I will show the installation of AudioCraft on your computer. I have also prepared auto-install and run scripts, which are shared on my Patreon post. You can download these scripts and use them directly. I will show how to use them as well, so we will have a comparison. I have put every command here one by one, so it is very easy to follow if you don't want to use my automated scripts.

05:34

First, enter the folder where you want to install it. I will make a test3 folder like this. I have entered the folder. We will begin by cloning the repository here. So open a new CMD window in here. And this is where my CMD window is.

05:52

Right-click and start cloning. Then move into the cloned folder: copy the command, paste it, and you are inside the folder. Then, if you want to use the same version that I used in this video, do a git checkout. Do this only if you encounter problems and it doesn't work. Otherwise, you don't need to do this.

06:12

You should use the latest version of the repository, because developers usually fix bugs and add new features. Then we will generate our own virtual environment. This is really important because this way it won't conflict with your other installations, such as Stable Diffusion. Copy, paste, and hit Enter. Then we will activate the virtual environment. This is really important: you always need to work with an activated virtual environment.
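For readers who want to see what creating and activating a virtual environment amounts to, here is a minimal sketch. It is not the command from the readme file. The folder name venv and the helper venv_commands are assumptions for illustration, while the "python -m venv" invocation and the Scripts and bin activation paths are the standard ones.

```python
import os
import sys
from pathlib import PurePosixPath, PureWindowsPath


def venv_commands(env_dir="venv", windows=None):
    """Return (create_command, activate_command) for a virtual environment.

    env_dir is a hypothetical folder name; any name works.
    """
    if windows is None:
        windows = os.name == "nt"
    # The same interpreter that runs this script creates the environment.
    create = [sys.executable, "-m", "venv", env_dir]
    if windows:
        # On Windows the activation script lives under Scripts\.
        activate = str(PureWindowsPath(env_dir, "Scripts", "activate.bat"))
    else:
        # On Linux and macOS the script is sourced from bin/.
        activate = "source " + str(PurePosixPath(env_dir, "bin", "activate"))
    return create, activate


if __name__ == "__main__":
    create, activate = venv_commands()
    print("Create:  ", " ".join(create))
    print("Activate:", activate)
```

Everything installed after activation stays inside that folder, which is why it cannot conflict with another project's libraries.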

06:41

Then we will install torch, torchvision, and torchaudio. Copy, right-click, and hit Enter. While this is installing, let me show you the scripts that I have made. For this, we will begin with cloning.

06:54

I will use the test4 folder: open a new CMD window and clone. Then cut the downloaded scripts and put them into the cloned main repository, which is audiocraft. Paste them here. So you need to put these files into your cloned repository, into your cloned directory.

07:11

Then just double-click the install.bat file. It may ask you this; then just click Run anyway. The bat file is very simple, like this. You can also code this yourself if you wish, but you don't have to. This bat file will install everything automatically for you. If you don't have a good GPU, don't worry: there is a special Google Colab made for AudioCraft. I will also show you how to install and use AudioCraft on a free Google Colab account. So we can continue our manual installation while the automatic installation runs on its own.

07:46

After installing torch, we just need to execute this command. I made this command very easy for you. Just copy, paste, and hit Enter.

07:57

You see, as I do more research and make more videos, I am improving my skills and providing you with better content. You don't have to support me on Patreon, but if you do, I would appreciate it very much. We have our links at the top: you see, support me on Patreon and YouTube. You can also follow me on Twitter.

08:18

I am not a faceless person. I am Dr. Furkan, and you can follow me. You can also connect with me on LinkedIn. Also, don't forget to join our Discord channel. When you click this link, you will see our Discord server. You see, we have over 3000 members and we are growing. I am expecting you there as well. Okay, this step is also completed. Now we will install xformers. So copy and paste. After you paste it, you need to hit Enter. All right, now we are ready to start the application, so I will close our manual installation and enter the folder. Open a new CMD window, then copy this command and hit Enter. It will start a Gradio interface.

09:03

Unfortunately, I couldn't find a way to install Triton on Windows. On the official AudioCraft repository, I have opened several issue topics. You may see them here as well. I have asked how we can install Triton. I have asked for more information about the top-k, top-p, temperature, and classifier-free guidance parameters. I also asked where we can get the list of text tokens it was trained on. So hopefully I will add more information to my readme file. So it has started and printed the URL. You need to copy this and open it in your browser. This is the interface that I showed you at the beginning of the video.

09:47

Just type anything you want, for example "amazing rap song", and select the model that you want to use. You can also use melody. If you use melody, you need to drag and drop the audio file here, so it will also use that as a condition.

10:03

There are also medium, small, and large models. Depending on your GPU VRAM, you can test them. On an RTX 3090, all of them work, and it is really fast.

10:13

So let's generate this with the large model. I will keep the video running while generating. Let me also show you the VRAM usage. I have several other applications open right now, so there is a little extra VRAM usage.

10:30

Video recording is also taking a lot of VRAM and GPU power. However, it is working right now, as you can see. Moreover, the first time you generate a song, it will download the models. Since I have already downloaded them, it didn't download anything. But the first time you generate, you will see download messages like this in the CMD window you launched. At the moment there are no messages, because it is using the cache with the downloaded model files. I will also show you where these model files are located.

11:06

It is still generating our audio. It has been about 6 seconds so far, and I am generating a 30-second song. I tried to generate more than 30 seconds. However, it is not able to generate more than 30 seconds, even if you wish. Okay, it took about 72 seconds, and you see, I am still recording. There are a lot of things going on right now, and it was really, really fast. Now, let's listen to it.

11:39

Okay, it looks like I didn't even write "song" properly. I wrote "some". As you write more detailed input here, it will generate much better music. I will look into what kind of prompts we can write. I'm not sure yet. This is like Stable Diffusion: you need to figure out the prompts. So let me show you where the model files are saved. They are saved on your C drive, inside Users, inside your username. In here, go to the .cache folder, then the huggingface folder, then the hub folder, and you will see the models, such as models--facebook--musicgen.

12:14

For example, the large model takes 6 GB on my hard drive. The medium model takes 3 GB on my hard drive, 3.6 to be exact. The melody model is 2.8 GB, and the small model takes only 1 GB on my hard drive. To use melody conditioning, just click here, select the MP3 file or another sound file, and hit Generate. Don't forget to select the melody model here. For example, let's also try this with melody and see how much time it takes. It says that it will take about 73 seconds, but I am not sure. I can say that Facebook Research is ahead of other companies, such as Google.
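To check how much disk space each downloaded model takes, something like the following sketch can walk the cache folder described above. It assumes the default cache location under the user's home folder (an environment variable such as HF_HOME can move it), and the helper names are made up for illustration. Symbolic links are skipped so that files are not counted twice.

```python
from pathlib import Path


def folder_size_gb(folder):
    """Total size of regular files under folder, in gigabytes (10**9 bytes)."""
    total = sum(
        p.stat().st_size
        for p in Path(folder).rglob("*")
        if p.is_file() and not p.is_symlink()  # skip links to avoid double counting
    )
    return total / 1e9


def list_cached_models(hub_dir=None):
    """Return {folder_name: size_gb} for each models--* folder in the hub cache."""
    if hub_dir is None:
        # Assumed default location, matching the path walked through in the video.
        hub_dir = Path.home() / ".cache" / "huggingface" / "hub"
    hub_dir = Path(hub_dir)
    if not hub_dir.is_dir():
        return {}
    return {
        d.name: folder_size_gb(d)
        for d in sorted(hub_dir.glob("models--*"))
        if d.is_dir()
    }


if __name__ == "__main__":
    for name, size in list_cached_models().items():
        print(f"{name}: {size:.1f} GB")
```

Deleting one of these folders only frees disk space. The model should be downloaded again the next time it is selected.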

13:00

Google announced MusicLM, but they didn't release any models or anything else to the public, so we couldn't test them. We only saw their demos. However, in here,

13:11

we have something live that we can test. We can play with it and experiment with it, and this makes Facebook much better in the AI realm. I hope I have pronounced "realm" correctly. Yeah, and this is amazing.

13:29

I am following all of the AI news, so stay subscribed to my channel and join our Discord. Hopefully, if something new comes out, I will make a video for it. I have a large backlog of new videos and new tutorials. Even better tutorials will come, hopefully. Okay, it took 72, 74, 75. Yes, seventy-five seconds. Let's listen.

14:21

I am pretty sure you will be able to compose much better music than me. Okay, now let me show you how to use these models, how to use AudioCraft on Google Colab for free.

14:32

Just click the link I shared in this GitHub readme file. By the way, I have a GitHub repository named Stable-Diffusion. This is my main repository. Please star it, fork it, and watch it. I appreciate it; it helps me grow. I have many other tutorials, useful stuff, and tricks here. I think you will like the other content I share here as well. So I open the Google Colab. I will open it like this. This is our Colab. First, begin by connecting: click Connect. It is connecting. This is a pretty simple script, made by camenduru. This guy is amazing. He releases so many Google Colab scripts.

15:16

First, verify that you are connected to a GPU. If you are not, change the runtime from here and select GPU. If you are not able to select GPU, that very possibly means that your account's phone number is not verified, or that you have used up all of your free GPU time.
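Besides the Colab menu, one way to verify the GPU and watch its memory is to query the nvidia-smi tool, which ships with NVIDIA drivers. The sketch below is an illustration rather than part of the Colab script. The helper names are hypothetical, and it returns an empty list when nvidia-smi is not available.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV row per GPU: name, total memory, used memory (MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into dicts; memory values are in MiB."""
    gpus = []
    for line in text.strip().splitlines():
        name, total, used = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "total_mib": int(total), "used_mib": int(used)})
    return gpus


def query_gpus():
    """Return the list of visible GPUs, or an empty list if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No NVIDIA GPU detected; check the runtime type.")
    for gpu in gpus:
        print(f"{gpu['name']}: {gpu['used_mib']} / {gpu['total_mib']} MiB used")
```

Running it before and during a generation gives a rough picture of how much memory a given model size needs.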

15:33

Then click the play icon, click Run anyway, and just wait until it installs everything and starts the Gradio link for us. Meanwhile, I will show you the automatic run script that I have shared on my Patreon post. The installation has already completed. You just double-click run.bat, click More info, and click Run anyway. This file is one hundred percent safe, because you can open it and look at what is inside. It is just this: these are just the commands we executed. The script simply automates them. And you see, you get your Gradio link here.

16:10

Okay, our installation is still running on Google Colab. You will also get this warning. Just ignore it. Don't restart, and don't click this play icon again. You see, it has started the Gradio link. Click it, and you will get a public Gradio page. This Gradio page is linked to this Google Colab runtime.

16:28

Let's make a test with this one. So I click it. Let's select the large model. I don't know whether the large model will hit an out-of-VRAM error on Google Colab. Let's try it. Submit. First, it will download the large model on Google Colab.

16:41

So this is running in the cloud. Nothing here will affect your computer or be downloaded onto your computer. Everything is on Google's servers. This is one hundred percent safe. Let's see. I wonder whether we will be able to use the large model on a free Google Colab account. This is a free account, so I have a GPU with only 15 GB. Okay, so far we don't have an out-of-memory error. It is using 8 GB. I think it has started processing.

17:12

We are waiting for the results. Google Colab displays extra information, as you can see right now. We are at 11 GB of GPU RAM. And we got the generated music. Nice. Oh, very nice. Now let's listen to it.

17:59

Okay, to download the generated music files, you need to click this icon, and it will download the file onto your computer. By the way, on Google Colab it generated an MP4 file.

18:11

I will now try a very long description prompt. Let's see whether it will cause an out-of-memory error or not, and how much time it will take. And let's follow the GPU RAM usage. I wonder whether the prompt length affects the amount of VRAM used. So you see, it is a huge prompt. I generated it with ChatGPT.

18:38

I am also monitoring the time it is going to take. On my computer, it usually takes about 60 seconds when the GPU is not heavily used. So on Google Colab, we will see. By the way, on Windows we weren't fully utilizing the accelerations, because the Triton library was missing on Windows. Google Colab runs on Linux, where Triton is available, so it is more optimized than using this repository on Windows.

19:11

Okay, it was 90 seconds, 100 seconds, 120 seconds. Let's look at the messages. 130. Okay, about 130 seconds. The model was already loaded, so we didn't wait for or count the loading time. Let's listen.

19:57

Wow, this was epic. So you see, there are so many things that you can do. I will test the same prompt on my computer to see whether there is any difference in generation time.

20:10

You see, on Windows we don't have Triton. Therefore, some optimizations will not be enabled. But I think my GPU is still two times faster than the one on Google Colab. So let's open the interface, type our text, select the large model, and select 30 seconds. By the way, if you use a shorter duration, that may reduce your VRAM usage. So let's submit. Oh, it is loading the model for the first time. So I need to repeat the experiment to exclude the loading time. Meanwhile, I will also shut down my Google Colab so it won't use my GPU time. Just click here, choose Disconnect and delete runtime, and it will delete everything and shut down the Google Colab.

20:52

Okay, 60 seconds. But this also includes the model loading time. Okay, it took about 96 seconds. Now I will submit again. By the way, each time you generate new music, it will be different from the previous one, depending on the top-k, top-p, temperature, and classifier-free guidance variables. I asked ChatGPT about them, and there is some information written in this GitHub readme file, which I will share as a link in the description of the video and also in the comments section. So you can read it and learn more. This is general information about machine learning models. It is probably pretty accurate as well.
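To give a feel for what these four variables do, here is a toy sketch in plain Python. It is not AudioCraft's code. It shows the usual textbook form of temperature scaling, top-k and top-p (nucleus) filtering, and classifier-free guidance on a small list of scores. The function names, and the choice that a non-zero top_p takes priority over top_k, are assumptions of this sketch.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=0.0, rng=random):
    """Sample an index from logits using temperature, then top-p or top-k filtering."""
    # Temperature rescales the scores: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank candidates from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0.0:
        # Nucleus sampling: keep the smallest prefix whose mass reaches top_p.
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
    elif top_k > 0:
        # Top-k sampling: keep only the k most likely candidates.
        kept = order[:top_k]
    else:
        kept = order

    # Draw one index among the survivors, weighted by probability.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def cfg_logits(cond, uncond, coef):
    """Classifier-free guidance: move scores away from the unconditional prediction."""
    return [u + coef * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5, -1.0]
    print([sample_token(scores, temperature=1.0, top_k=2) for _ in range(10)])
```

Because each step draws at random from the surviving candidates, two runs with the same prompt produce different music. A lower temperature or a smaller top-k makes the draws more predictable, and a larger guidance coefficient makes the output follow the prompt more closely.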

21:38

Okay, 40 seconds currently. I am testing only with the default parameters that the developers have set, but you can change them and see what kind of impact they make. By the way, we are still recording video, so it is a little slower than it should be. Around 70 seconds. On the right side, you see the time the last generation took to produce the music file. Okay, 85. It looks like a longer prompt increases the time it takes to generate a song. Yeah, it significantly increased the time. And wow, this time it is taking even longer. It is probably because I am talking more. So, yeah. Okay, let's listen to this one.

22:55

I hope someone figures out how to generate more than 30 seconds, because this is amazing. To download, click this three-dots icon and click Download. This is all for today. I hope you have enjoyed it.

23:06

Please support my Patreon. You can click here. You can connect with me on LinkedIn from here. You can follow me on Twitter from here. Please also subscribe, leave a comment, share, and support me by joining on YouTube. I appreciate it very much. You will find the readme file link in the description of the video, like here: you see, "source GitHub file". This is from another video that follows the same logic. And in the pinned comment, you will also find the link to this readme file I shared.

23:35

It is extremely important. I will keep it up to date, so if an error or something happens, I will write it here based on your feedback.

23:45

If there are any other libraries that you need to install, I will write them here. Moreover, I have used pip freeze to list all of the installed libraries in my generated virtual environment. This is also written at the very bottom of the readme file, so you can see all of the libraries with their versions. This is extremely useful, because if you watch this video in the future and encounter a library error, you can see the version here and install that specific version. To install a specific version, use the following format: pip install, then the library that you want to install, then two equals signs (==), then the version. This way, you can install a specific version of each library.
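The pinning format described above can be sketched as follows. The package name and version in the usage lines are placeholders, not values from the readme file, and the helper names are made up for illustration. The commands run pip through the current interpreter, so they act on whichever virtual environment is active.

```python
import subprocess
import sys


def pinned(name, version):
    """Build a pip requirement that pins one exact version: name==version."""
    return f"{name}=={version}"


def pip_install_command(name, version):
    """Command list that installs the pinned package for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", pinned(name, version)]


def installed_versions():
    """Return {package: version} parsed from the output of pip freeze."""
    out = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    ).stdout
    pairs = (line.split("==", 1) for line in out.splitlines() if "==" in line)
    return {name: version for name, version in pairs}


if __name__ == "__main__":
    # "somepackage" and "1.2.3" are placeholders; substitute the library and
    # version listed at the bottom of the readme file.
    print(" ".join(pip_install_command("somepackage", "1.2.3")))
```

Comparing the output of installed_versions() with the list in the readme file is a quick way to spot which library has drifted to a different version.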

24:38

I hope you have enjoyed it. Hopefully, I will see you in another amazing tutorial.

My translation of the video content follows: