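Phrame

Phrame generates captivating, unique art by listening to the conversations around it, transforming spoken words and emotions into visually stunning masterpieces. Unleash your creativity and transform the soundscape around you.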

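How It Works

Phrame relies on the SpeechRecognition interface of the Web Speech API to transcribe audio into text. OpenAI processes this text and produces a condensed summary. The summary is then passed to the configured generative AI image services, and the final images are saved.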
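
The flow described above can be pictured as a small pipeline. The following is an illustrative sketch only, not Phrame's actual code: `Artwork`, `run_pipeline`, and the callables passed in are hypothetical names, and the summary and image services are supplied by the caller so the example runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Artwork:
    service: str   # name of the image service that produced the image
    summary: str   # condensed summary used as the image prompt
    image: bytes   # image data returned by the service


def run_pipeline(
    transcripts: List[str],
    summarize: Callable[[str], str],
    services: Dict[str, Callable[[str], bytes]],
) -> List[Artwork]:
    """Turn transcripts into one artwork per configured image service."""
    # 1. Join the transcribed speech into a single block of text.
    text = " ".join(t.strip() for t in transcripts if t.strip())
    if not text:
        return []
    # 2. Condense the text into a short summary (OpenAI in Phrame).
    summary = summarize(text)
    # 3. Hand the summary to every configured image service.
    return [
        Artwork(name, summary, generate(summary))
        for name, generate in services.items()
    ]
```

In a real deployment `summarize` and the entries in `services` would wrap network calls; here they are left as plain callables so each stage can be swapped or stubbed.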

Donations

If you would like to make a donation to support development, please use GitHub Sponsors.

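Features

Create unique AI-generated artwork from spoken conversations
Automatic, manual, or voice-activated summary generation for on-demand art
User-friendly UI optimized for desktop and mobile
Real-time updates and remote control via WebSockets
Integrated configuration editor for customization
Support for multiple generative AI image services
Voice commands for image generation and navigation
Effortless gallery management: browse, favorite, and delete images, and navigate with keyboard shortcuts
Access and manage logs for efficient troubleshooting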

Supported Architectures

AMD64
ARM64

Supported AIs

OpenAI
Midjourney*
Stability AI
Dream
DeepAI
Leonardo.Ai

* Midjourney currently relies on an unofficial third-party package. Use this integration at your own risk.

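Voice Commands

With the microphone active, use the following voice commands to interact with Phrame.

Command	Action
Hey Phrame	Wake word to generate images on demand
Next Image	Advance to the next image
Previous Image	Go back to the previous image
Last Image	Go back to the previous image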
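
As a rough sketch of how recognized speech might be matched against the commands in the table above: the function names, the action labels, and the normalization rules below are assumptions made for illustration, not Phrame's implementation.

```python
import re
from typing import Optional

# Phrases from the table above mapped to hypothetical action names.
COMMANDS = {
    "hey phrame": "generate",
    "next image": "next",
    "previous image": "previous",
    "last image": "previous",
}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def match_command(transcript: str) -> Optional[str]:
    """Return the action for the first listed phrase found in the transcript."""
    spoken = normalize(transcript)
    for phrase, action in COMMANDS.items():
        if phrase in spoken:
            return action
    return None
```

Normalizing before matching means a transcript such as "Hey, Phrame!" still resolves to the wake-word action even though the browser added punctuation.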

UI

Phrame has a responsive UI available at localhost:3000.

Path	Name
/	Controller
/phrame?mic	Phrame with microphone support
/phrame	Phrame without microphone support
/gallery	Gallery
/config	Config
/logs	Logs

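Privacy

Speech recognition in Phrame is handled by the browser. How the audio data is processed depends on the specific browser in use. Chrome, for example, captures the audio and sends it to Google's servers for transcription. You are encouraged to review the privacy policy of your chosen browser to fully understand how your voice data is handled.

After transcription, Phrame saves the transcripts to a local database. They are then processed by OpenAI to generate a summary, after which the original transcripts are immediately deleted. The summary is used with the configured generative AI image services, and the final artwork is saved locally.

To be clear, Phrame does not retain your transcripts or transmit them beyond your local device, except for the short period required to generate a summary through OpenAI. Outside of these specific cases, no personal data is used, stored, or transmitted for any other purpose.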

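Usage

Phrame runs as a single Docker container and can be accessed from any modern browser, even one without a microphone.

To use the speech recognition features, a compatible browser and a microphone are required. Currently, Chrome and Safari are the only browsers that support speech recognition.

Artwork in Phrame is displayed according to the image.order value. Images from the latest summary and any favorited images are merged seamlessly, providing a constantly evolving canvas of AI-generated art. As new images are created, they are shown in Phrame immediately.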
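
One way to picture the display behavior described above is the sketch below. It is a guess at the logic, not Phrame's code: the `Image` fields, the `display_order` function, and the assumption that a higher `summary_id` means a newer summary are all hypothetical. Only the `random` and `recent` values come from the configuration reference.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Image:
    name: str
    created_at: float       # Unix timestamp
    summary_id: int         # assumed: larger id means newer summary
    favorite: bool = False


def display_order(
    images: List[Image],
    order: str = "recent",
    rng: Optional[random.Random] = None,
) -> List[Image]:
    """Merge latest-summary images with favorites, then apply image.order."""
    if not images:
        return []
    latest = max(img.summary_id for img in images)
    shown = [img for img in images if img.summary_id == latest or img.favorite]
    if order == "random":
        (rng or random).shuffle(shown)
        return shown
    if order == "recent":
        return sorted(shown, key=lambda img: img.created_at, reverse=True)
    raise ValueError(f"unknown image.order: {order!r}")
```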

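Quick Start

1. Start Phrame
2. Go to localhost:3000/config
   - Add your OpenAI API key and save
   - Verify that OpenAI shows as configured with a green circle
3. In a new window, go to localhost:3000/phrame?mic and follow the on-screen instructions
4. Go to localhost:3000 and verify that the microphone and speech recognition are working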

Docker Run

docker run -d --restart=unless-stopped --name=phrame -v phrame:/.storage -p 3000:3000 jakowenko/phrame

Docker Compose

version: '3.9'

volumes:
  phrame:

services:
  phrame:
    container_name: phrame
    image: jakowenko/phrame
    restart: unless-stopped
    volumes:
      - phrame:/.storage
    ports:
      - 3000:3000

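Start on Boot

Modern browsers require a user click before granting access to the microphone. To start Phrame automatically at boot, you can use one of the following scripts. They require ydotool or xdotool (depending on your display server), which let you simulate keyboard input and mouse activity.

The script waits 15 seconds for the Docker engine and Phrame to start before launching Chrome. You can adjust the delay by changing the sleep value. After the browser launches, the script waits another 5 seconds and then sends a click to grant microphone access and start speech recognition.

Depending on your system, you may need to adjust the path to Chrome.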

ydotool

#!/bin/bash

export YDOTOOL_SOCKET=/tmp/.ydotool_socket

# wait for the desktop and docker to be fully loaded
sleep 15s

# launch chrome in kiosk mode for microphone access
/usr/bin/google-chrome-stable --kiosk --no-first-run --hide-crash-restore-bubble --password-store=basic "http://localhost:3000/phrame?mic" &

# wait for chrome and phrame to load
sleep 5s

# move the mouse to the coordinates and click the left mouse button
ydotool mousemove --absolute 0 0
ydotool click 0xC0

xdotool

#!/bin/bash

# wait for the desktop and docker to be fully loaded
sleep 15s

# launch chrome in kiosk mode for microphone access
/usr/bin/google-chrome-stable --kiosk --no-first-run --hide-crash-restore-bubble --password-store=basic "http://localhost:3000/phrame?mic" &

# wait for chrome and phrame to load
sleep 5s

# move the mouse to the coordinates and click the left mouse button
xdotool mousemove --sync 0 0 click 1
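
The scripts above rely on fixed sleep delays. On slower machines it may be more reliable to wait until Phrame actually accepts connections. The following is a sketch of that idea and not part of Phrame: `wait_for_port` is a hypothetical helper, and port 3000 is simply the port published in the Docker examples.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); wait and retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 3000)` could gate the browser launch in place of `sleep 15s`. The short pause before the simulated click would still be needed, since an open port does not mean the page has finished loading.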

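Configuration

Configurable options are saved to /.storage/config/config.yml and can be edited through the UI at localhost:3000/config.

Note: Default values do not need to be specified in the configuration unless you want to override them.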
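
The note above implies that user settings are layered on top of built-in defaults. A minimal sketch of such layering, assuming a recursive merge, is shown below. `merge_config` is a hypothetical helper, not Phrame's code, and `DEFAULTS` contains only a small subset of the documented defaults.

```python
from copy import deepcopy
from typing import Any, Dict

# A subset of the documented defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {
    "image": {"interval": 60, "order": "recent"},
    "transcript": {"cron": "*/30 * * * *", "minutes": 30, "minimum": 5},
    "logs": {"level": "info"},
}


def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults with overrides applied, merging nested mappings."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

With this approach a config file containing only `image.order: random` would change that one value and leave `image.interval` and every other section at its default.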

image

# image settings (default: shown below)

image:
  # time in seconds between image transitions
  interval: 60
  # order of images to display: random, recent
  order: recent

autogen

Images can be generated automatically by creating random summaries. This can be scheduled with a cron expression. Keywords can be provided to help guide the summary.

# autogen settings (default: shown below)

autogen:
  # schedule as a cron expression for auto-generating images (at minutes 15 and 45 of every hour)
  cron: '15,45 * * * *'
  prompt: Provide a random short description to describe a picture. It should be no more than one or two sentences. If keywords are provided select a couple at random to help guide the description.
  # keywords to guide the summary
  keywords: []
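
To check which minutes of the hour a schedule fires on, the minute field of the cron expression can be expanded by hand. The sketch below is a simplified helper written for this README, not a full cron parser: `cron_minutes` is a hypothetical name, and it only understands `*`, single values, comma lists, `a-b` ranges, and `/n` steps in the first field.

```python
from typing import List


def cron_minutes(expression: str) -> List[int]:
    """Expand the minute field of a five-field cron expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    minutes = set()
    for part in fields[0].split(","):
        base, _, step = part.partition("/")
        step_size = int(step) if step else 1
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(base)
        minutes.update(range(start, end + 1, step_size))
    return sorted(minutes)
```

Under these rules `'15,45 * * * *'` expands to minutes 15 and 45, and `'*/30 * * * *'` expands to minutes 0 and 30.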

transcript

Images are generated by processing transcripts. This can be scheduled with a cron expression. All transcripts from the last X minutes are then summarized using openai.summary.prompt.

# transcript settings (default: shown below)

transcript:
  # schedule as a cron expression for processing transcripts (every 30 minutes)
  cron: '*/30 * * * *'
  # how many minutes of transcripts to look back (process the last 30 minutes of transcripts)
  minutes: 30
  # minimum number of transcripts required to process
  minimum: 5
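
The interaction between the look-back window and the minimum count can be sketched as follows. This is an interpretation of the settings above rather than Phrame's code: `Transcript` and `select_transcripts` are hypothetical names, and treating the window as inclusive at both ends is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Transcript:
    text: str
    created_at: datetime


def select_transcripts(
    transcripts: List[Transcript],
    now: datetime,
    minutes: int = 30,
    minimum: int = 5,
) -> List[Transcript]:
    """Return transcripts inside the window, or nothing if too few exist."""
    cutoff = now - timedelta(minutes=minutes)
    recent = [t for t in transcripts if cutoff <= t.created_at <= now]
    # Skip processing entirely when there is not enough material.
    return recent if len(recent) >= minimum else []
```

Returning an empty list when the minimum is not met models the idea that a scheduled run does nothing until enough conversation has been captured.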

openai

To configure OpenAI, obtain an API key and add it to your config. All other default settings will be applied. You can override them by updating your config.yml file.

# openai settings (default: shown below)

openai:
  # api key
  key:

  summary:
    # model name (https://platform.openai.com/docs/models/overview)
    model: gpt-3.5-turbo
    # prompt used to generate a summary from transcripts
    prompt: You will be given a string of random conversations and need to pull out a few keywords and topics that were talked about. You will then turn this into a short description to describe a picture. It should be no more than two or three sentences.
    # prompt used to generate a random summary
    random: Provide a random short description to describe a picture. It should be no more than two or three sentences.

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # size of the generated images: 256x256, 512x512, or 1024x1024
    size: 512x512
    # number of images to generate for each style
    n: 1
    # used with summary to guide the image model towards a particular style
    style:
      - cinematic

midjourney

Midjourney currently relies on an unofficial third-party package. Use this integration at your own risk.

To configure Midjourney, you will need the following:

Discord server ID and channel ID
- Obtained by opening your Discord channel in a browser; the URL should follow this pattern: https://discord.com/channels/server_id/channel_id
Invite the Midjourney Bot to your server
Although not required, a Hugging Face token is also recommended for safe prompts

All other default settings will be applied. You can override them by updating your config.yml file.

# midjourney settings (default: shown below)

midjourney:
  # discord server id
  server_id:
  # discord channel id
  channel_id:
  # discord token (https://linuxhint.com/get-discord-token)
  token:
  # hugging face token (https://huggingface.co/docs/hub/security-tokens)
  hugging_face_token:

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # options added to a prompt that change how an image generates (https://docs.midjourney.com/docs/parameter-list)
    parameters: --chaos 80 --no text
    # upscale options (false, random, 1,2,3,4)
    upscale: random
    # used with summary to guide the image model towards a particular style
    style:
      - cinematic
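
The server_id and channel_id values can be read straight from the channel URL pattern given earlier in this section. As a convenience sketch (the helper name `parse_channel_url` is hypothetical, and it assumes both IDs are purely numeric), the two values can be extracted like this:

```python
import re
from typing import Tuple

# Matches https://discord.com/channels/<server_id>/<channel_id>
CHANNEL_URL = re.compile(r"^https://discord\.com/channels/(\d+)/(\d+)/?$")


def parse_channel_url(url: str) -> Tuple[str, str]:
    """Return (server_id, channel_id) from a Discord channel URL."""
    match = CHANNEL_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a Discord channel URL: {url!r}")
    # Keep the IDs as strings so the long numbers are copied exactly.
    return match.group(1), match.group(2)
```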

stabilityai

To configure Stability AI, obtain an API key and add it to your config. All other default settings will be applied. You can override them by updating your config.yml file.

# stabilityai settings (default: shown below)

stabilityai:
  # api key
  key:

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # number of seconds before the request times out and is aborted
    timeout: 30
    # engine used for image generation
    engine_id: stable-diffusion-512-v2-1
    # width of the image in pixels, must be in increments of 64
    width: 512
    # height of the image in pixels, must be in increments of 64
    height: 512
    # how strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt)
    cfg_scale: 7
    # number of images to generate for each style
    samples: 1
    # number of diffusion steps to run
    steps: 50
    # image model style (https://platform.stability.ai/rest-api#tag/v1generation/operation/textToImage)
    style:
      - cinematic
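
Because width and height must be in increments of 64, it can help to round a desired size before writing it to the config. The helper below is a small sketch for that purpose; `snap_to_multiple` is a hypothetical name and is not part of Phrame or the Stability AI API.

```python
def snap_to_multiple(value: int, multiple: int = 64) -> int:
    """Round value to the nearest positive multiple (ties round up)."""
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    snapped = (value + multiple // 2) // multiple * multiple
    # Never return 0: the smallest valid size is one full increment.
    return max(multiple, snapped)
```

For instance, a requested width of 600 would be adjusted to 576 and a width of 500 to 512.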

deepai

To configure DeepAI, obtain an API key and add it to your config. All other default settings will be applied. You can override them by updating your config.yml file.

# deepai settings (default: shown below)

deepai:
  # api key
  key:

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # number of seconds before the request times out and is aborted
    timeout: 30
    # 1 returns one image and 2 returns four images
    grid_size: 1
    # width of the image in pixels, between 128 and 1536
    width: 512
    # height of the image in pixels, between 128 and 1536
    height: 512
    # indicate what you want to be removed from the image
    negative_prompt:
    # image model style (https://deepai.org/machine-learning-model/text2img)
    style:
      - text2img

dream

To configure Dream, obtain an API key and add it to your config. All other default settings will be applied. You can override them by updating your config.yml file.

# dream settings (default: shown below)

dream:
  # api key
  key:

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # number of seconds before the request times out and is aborted
    timeout: 30
    # width of the image in pixels
    width: 512
    # height of the image in pixels
    height: 512
    # image model style (https://api.luan.tools/api/styles)
    style:
      - buliojourney v2

leonardoai

To configure Leonardo.Ai, obtain an API key and add it to your config. All other default settings will be applied. You can override them by updating your config.yml file.

# leonardoai settings (default: shown below)

leonardoai:
  # api key
  key:

  image:
    # enable or disable image generation
    enable: true
    # trim letterbox and pillarbox images
    trim: false
    # number of seconds before the request times out and is aborted
    timeout: 30
    # indicate what you want to be removed from the image
    negative_prompt:
    # model id used for the image generation, if not provided uses sd_version to determine the version of stable diffusion to use
    model_id: 6bef9f1b-29cb-40c7-b9df-32b51c1f67d3
    # base version of stable diffusion to use if not using a custom model
    sd_version: v2
    # number of images to generate for each style
    num_images: 1
    # width of the image in pixels, must be between 32 and 1024 and be a multiple of 8
    width: 512
    # height of the image in pixels, must be between 32 and 1024 and be a multiple of 8
    height: 512
    # number of inference steps to use for the generation, must be between 30 and 60
    num_inference_steps:
    # how strongly the generation should reflect the prompt, must be between 1 and 20
    guidance_scale: 7
    # scheduler to generate images with
    scheduler:
    # style to generate images with
    preset_style: LEONARDO
    # whether the generated images should tile on all axes
    tiling:
    # whether the generated images should show in the community feed
    public:
    # enable to use prompt magic
    prompt_magic:
    # used with summary to guide the image model towards a particular style
    style:
      - cinematic
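
Several of the settings above carry numeric limits in their comments. A quick pre-flight check of those limits might look like the sketch below. `validate_leonardo_image` is a hypothetical helper written for this README, it covers only the four constrained keys, and unset values are treated as "use the default".

```python
from typing import Dict, List


def validate_leonardo_image(image: Dict[str, object]) -> List[str]:
    """Return a list of problems found in a leonardoai.image mapping."""
    errors: List[str] = []
    for key in ("width", "height"):
        value = image.get(key)
        if value is None:
            continue
        if not isinstance(value, int) or not 32 <= value <= 1024 or value % 8 != 0:
            errors.append(f"{key} must be between 32 and 1024 and a multiple of 8")
    steps = image.get("num_inference_steps")
    if steps is not None and not (isinstance(steps, int) and 30 <= steps <= 60):
        errors.append("num_inference_steps must be between 30 and 60")
    scale = image.get("guidance_scale")
    if scale is not None and not (isinstance(scale, (int, float)) and 1 <= scale <= 20):
        errors.append("guidance_scale must be between 1 and 20")
    return errors
```

An empty list means the checked values fall inside the documented ranges; it does not guarantee the service will accept the request.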

time

# time settings (default: shown below)

time:
  # defaults to iso 8601 format with support for token-based formatting
  # https://github.com/moment/luxon/blob/master/docs/formatting.md#table-of-tokens
  format:
  # time zone used in logs
  timezone: UTC

logs

# log settings (default: shown below)

logs:
  # options: silent, error, warn, info, http, verbose, debug, silly
  level: info

telemetry

# telemetry settings (default: shown below)
# self hosted version of plausible.io
# 100% anonymous, used to help improve project
# no cookies and fully compliant with GDPR, CCPA and PECR

telemetry: true

Development

Run Local Services

Service	Command	URL
UI	npm run local:frontend	localhost:8080
API	npm run local:api	localhost:3000

Build Local Docker Image

./.develop/build
